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Foreword 


If everyone were exposed to mathematics in its natural 
state, with all the challenging fun and surprises that that 
entails, I think we would see a dramatic change both in the 
attitude of students toward mathematics, and in our 
conception of what it means to be “good at math.” 

Paul Lockhart 


I’m really hungry for surprises because each one makes us 
ever-so-slightly but substantially smarter. 
Tadashi Tokieda 


Mathematics, when appropriately approached, can provide us with plentiful pleas- 
ant surprises. This is confirmed by a Google search of “mathematical surprises,” 
which, surprisingly, yields almost half a billion items. What is a surprise? The ori- 
gins of the word trace back to Old French with roots in Latin: “sur” (over) and 
“prendre” (to take, to grasp, to seize). Literally, to surprise is to overtake. As a noun, 
surprise is both an unanticipated or bewildering event or circumstance, as well as 
the emotion caused by it. 

Consider, for example, an extract from a lecture by Maxim Bruckheimer! on 
the Feuerbach circle: “Two points lie on one and only one straight line, this is no 
surprise. However, three points are not necessarily on one straight line and if, during 
a geometrical exploration, three points ‘fall into’ a straight line, this is a surprise and 
frequently we need to refer to this fact as a theorem to be proven. Any three points 
not on a straight line lie on one circle. However, if four points lie on the same circle, 
this is a surprise that should be formulated as a theorem. ... Insofar as the number 
of points on a straight line is larger than 3, so is the theorem the more surprising. 


1 Maxim Bruckheimer was a mathematician who was one of the founders of the Open University 
UK and Dean of its Faculty of Mathematics. He was Head of the Department of Science Teaching 
at the Weizmann Institute of Science. 
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Likewise, insofar as the number of points lying on one circle is larger than 4, so is 
the theorem the more surprising. Thus, the statement that for any triangle there are 
nine related points on the same circle ... is very surprising. Moreover, in spite of 
the magnitude of the surprise, its proof is elegant and easy.” 

In this book Mordechai Ben-Ari offers a rich collection of mathematical surprises, 
most of them less well known than the Feuerbach Circle and with sound reasons 
for including them. First, in spite of being absent from textbooks, the mathematical 
gems of this book are accessible with just a high school background (and patience, 
and paper and pencil, since fun does not come for free). Second, when a mathemat- 
ical result challenges what we take for granted, we are indeed surprised (Chaps. 1, 
13). Similarly, we are surprised by: the cleverness of an argument (Chaps. 2, 3), 
the justification of the possibility of a geometric construction by algebraic means 
(Chap. 16), a proof relying on an apparently unrelated topic (Chaps. 4, 5), a strange 
proof by induction (Chap. 6), new ways of looking at a well-known result (Chap. 7), 
a seemingly minor theorem becoming the foundation of a whole field of mathe- 
matics (Chap. 8), unexpected sources of inspiration (Chap. 9), rich formalizations 
emerging from purely recreational activities such as origami (Chaps. 10-12). These 
are all different reasons for the inclusion of the pleasant, beautiful and memorable 
mathematical surprises in this lovely book. 

So far I have addressed how the book relates to the first part of the definition of 
surprise, the cognitive rational reasons for the unexpected. As to the second aspect, 
the emotional aspect, this book is a vivid instantiation of what many mathematicians 
claim regarding the primary reason for doing mathematics: it is fascinating! More- 
over, they claim that mathematics stimulates both our intellectual curiosity and our 
esthetic sensibilities, and that solving a problem or understanding a concept provides 
a spiritual reward, which entices us to keep working on more problems and concepts. 

It has been said that the function of a foreword tell readers why they should read 
the book. I have tried to accomplish this, but I believe that the fuller answer will 
come from you, the reader, after reading it and experiencing what the etymology of 
the word surprise suggests: to be overtaken by it! 


Abraham Arcavi 


Preface 


Godfried Toussaint’s article on the “collapsing compass” [50] made a profound 
impression on me. It would never have occurred to me that the modern compass with 
a friction joint is not the one used in Euclid’s day. In this book I present a selection 
of mathematical results that are not only interesting, but that surprised me when I 
first encountered them. 

The mathematics required to read the book is secondary-school mathematics, but 
that does not mean that the material is simple. Some of the proofs are quite long and 
require that the reader be willing to persevere in studying the material. The reward 
is understanding of some of the most beautiful results in mathematics. The book 
is not a textbook, because the wide range of topics covered doesn’t fit neatly into 
a syllabus. It is appropriate for enrichment activities for secondary-school students, 
for college-level seminars and for mathematics teachers. 

The chapters can be read independently. (An exception is that Chap. 10 on the 
axioms of origami is a prerequisite for Chaps. 11, 12, the other chapters on origami.) 
Notes relevant to all chapters are given below in list labeled Style. 


What Is a Surprise? 


There were three criteria for including a topic in the book: 


e The theorem surprised me. Particularly surprising were the theorems on con- 
structibility with a straightedge and compass. The extremely rich mathematics of 
origami was almost shocking: when a mathematics teacher proposed a project on 
origami, I initially turned her down because I doubted that there could be any 
serious mathematics associated with the art form. Other topics were included 
because, although I knew the results, their proofs were surprising in their ele- 
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gance and accessibility, in particular, Gauss’s purely algebraic proof that a regular 
heptadecagon can be constructed. 

e The material does not appear in secondary-school and college textbooks, and I 
found these theorem and proofs only in advanced textbooks and in the research 
literature. There are Wikipedia articles on most of the topics, but you have to 
know where to look and the articles are often outlines. 

e The theorems and proofs are accessible with a good knowledge of secondary- 
school mathematics. 


Each chapter concludes with a paragraph What Is the Surprise? which explains my 
choice of the topic. 


An Overview of the Contents 


Chapter 1 presents Euclid’s proof that any construction that is possible with a fixed 
compass is possible with a collapsing compass. Many proofs have been given, but, as 
Toussaint shows, most are incorrect because they depend on diagrams which do not 
always correctly depict the geometry. To emphasize that one must not trust diagrams, 
I present the famous alleged proof that every triangle is isoceles. 

Over the centuries mathematicians unsuccessfully sought to trisect an arbitrary 
angle (divide it into three equal parts) using only a straightedge and compass. 
Underwood Dudley made a comprehensive study of trisectors who find incorrect 
constructions; most constructions are approximations that are claimed to be accu- 
rate. Chapter 2 starts by presenting two of these constructions and developing the 
trigonometric formulas showing that they are only approximations. To show that 
trisection using just a straightedge and compass is of no practical importance, trisec- 
tions using more complex tools are presented: Archimedes’s neusis and Hippias’s 
quadratrix. The chapter ends with a proof that it is impossible to trisect an arbitrary 
angle with a straightedge and compass. 

Squaring a circle (given a circle construct a square with the same area) cannot 
be performed using a straightedge and compass, because the value of z cannot be 
constructed. Chapter 3 presents three elegant constructions of close approximations 
to x, one by Kochański and two by Ramanujan. The chapter concludes by showing 
that a quadratrix can be used to square a circle. 

The four-color theorem states that it is possible to color any planar map with 
four colors, such that no countries with a common boundary are colored with the 
same color. The proof of this theorem is extremely complicated, but the proof of the 
five-color theorem is elementary and elegant, as shown in Chapter 4. The chapter 
also presents Percy Heawood’s demonstration that Alfred Kempe’s “proof” of the 
four-color theorem is incorrect. 
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How many guards must be employed by an art museum so that all the walls 
are under constant observation by at least one guard? The proof in Chapter 5 is 
quite clever, using graph coloring to solve what at first sight appears to be a purely 
geometrical problem. 

Chapter 6 presents some lesser-known results and their proofs by induction: 
theorems on Fibonacci numbers and Fermat numbers, McCarthy’s 91 function, and 
the Josephus problem. 

Chapter 7 discusses Po-Shen Loh’s method of solving quadratic equations. The 
method is a critical element of Gauss’s algebraic proof that a heptadecagon can be 
constructed (Chapter 16). The chapter includes al-Khwarizmi’s geometric construc- 
tion for finding roots of quadratic equations and a geometric construction used by 
Cardano in the development of the formula for finding roots of cubic equations. 

Ramsey theory is a topic in combinatorics that is an active area of research. It 
looks for patterns among subsets of large sets. Chapter 8 presents simple examples of 
Schur triples, Pythagorean triples, Ramsey numbers and van der Waerden’s problem. 
The proof of the theorem on Pythagorean triples was accomplished recently with 
the aid of a computer program based on mathematical logic. The chapter concludes 
with a digression on the ancient Babylonians’ knowledge of Pythagorean triples. 

C. Dudley Langford observed his son playing with colored blocks and noticed 
that he had laid them out in an interesting sequence. Chapter 9 presents his theorem 
on the conditions for such a sequence to be possible. 

Chapter 10 contains the seven axioms of origami, together with the detailed 
calculations of the analytic geometry of the axioms, and characterizations of the 
folds as geometric loci. 

Chapter 11 presents Eduard Lill’s method and the origami fold proposed by 
Margharita P. Beloch. I introduce Lill’s method as a magic trick so I won’t spoil it 
by giving details here. 

Chapter 12 shows that origami can perfom constructions not possible with a 
straightedge and compass: trisecting an angle, squaring a circle and constructing a 
nonagon (a regular polygon with nine sides). 

Chapter 13 presents the theorem by Georg Mohr and Lorenzo Mascheroni that 
any construction with a straightedge and compass can be performed using only a 
compass. 

The corresponding claim that a straightedge only is sufficient is incorrect, because 
a straightedge cannot compute lengths that are square roots. Jean-Victor Poncelet 
conjectured and Jakob Steiner proved that a straightedge is sufficient, provided that 
there exists a single fixed circle somewhere in the plane (Chap. 14). 

If two triangles have the same perimeter and the same area must they be congru- 
ent? That seems reasonable but it turns out not to be true, although it takes quite a 
bit of algebra and geometry to find a non-congruent pair as shown in Chap. 15. 
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Chapter 16 presents Gauss’s tour-de-force: a proof that a heptadecagon (a regular 
polygon with seventeen sides) can be constructed using a straightedge and compass. 
By a clever argument on the symmetry of the roots of polynomials, he obtained 
a formula that uses only the four arithmetic operators and square roots. Gauss did 
not give an explicit construction of a heptadecagon, so the elegant construction by 
James Callagy is presented. The chapter concludes with constructions of a regular 
pentagon based on Gauss’s method for the construction of a heptadecagon. 

To keep the book as self-contained as possible, Appendix A collects proofs of 
theorems of geometry and trigonometry that may not be familiar to the reader. 


Style 


e The reader is assumed to have a good knowledge of secondary-school mathemat- 
ics, including: 


— Algebra: polynomials and division of polynomials, monic polynomials—those 
whose coefficient of the highest power is 1, quadratic equations, multiplication 
of expressions with exponents a” - a” = a’™"*", 

— Euclidean geometry: congruent triangles AABC = ADEF and the criteria for 
congruence, similar triangles AABC ~ ADEF and the ratios of their sides, 
circles and their inscribed and central angles. 

— Analytic geometry: the cartesian plane, computing lengths and slopes of line 
segments, the formula for a circle. 

— Trigonometry: the functions sin, cos, tan and the conversions between them, 
angles in the unit circle, the trigonometric functions of angles reflected around 
an axis such as cos(180° — 6) = — cos 8. 


e Statements to be proved are called theorems with no attempt to distinguish between 
theorems, lemmas and corollaries. 

e When a theorem follows a construction, the variables that appear in the the- 
orem refer to labeled points, lines and angles in the figure accompanying the 
construction. 

e The full names of mathematicians have been given without biographical infor- 
mation that can be found easily in Wikipedia. 

e The book is written so that it is as self-contained as possible, but occasionally 
the presentation depends on advanced mathematical concepts and theorems that 
are given without proofs. In such cases, a summary of the material is presented 
in boxes which may be skipped. 

e There are no exercises but the ambitious reader is invited to prove each theorem 
before reading the proof. 
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e Geometric constructions can be studied using software such as Geogebra. 
e AB is used both for the name of a line segment and for the length of the segment. 
e AABC is used both for the name a triangle and for the area of the triangle. 
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Chapter 1 


The Collapsing Compass 


A 


Check for 
updates | 


A modern compass is a fixed compass: the distance between the two legs can be 
fixed so that it is possible to copy a line segment or a circle from one position to 
another (Fig. 1.1a). Euclid used a collapsing compass where a fixed distance cannot 
be maintained (Fig. 1.1b). Teachers often use a collapsing compass consisting of a 
marker tied to a string that is used to construct a circle on a whiteboard. Itis impossible 
to maintain a fixed length when the compass is removed from the whiteboard. 


` 


Fig. 1.la A fixed compass. One leg has a 
needle that is placed at the center of the circle. 
A pencil attached to the other leg is used to 
draw the circle. The legs are joined by a tight 
hinge so that the distance between the legs 
(the radius of the circle) is maintained even 
when the compass is lifted from the paper. 


T 


Fig. 1.1b A collapsing compass. The user 
holds a piece of string at the center of the 
circle. The other end of the string is tied to a 
pencil and is used to draw the circle. When the 
compass is lifted from the paper, the fingers 
(dashed) can easily slip to a new position. 


This chapter begins with a discussion of the relevance of studying construction 
with a straightedge and compass (Sect. 1.1). Section 1.2 compares the two types of 
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compasses in the most elementary construction: a perpendicular bisector. Section 1.3 
presents Euclid’s method of copying a line segment using a collapsing compass. This 
proves that any construction that can be done using a fixed compass can be performed 
using a collapsing compass. Section 1.4 shows a proof of this theorem which seems to 
be correct, but does not work for all configurations of lines and points. To emphasize 
that one must not trust diagrams, Sect. 1.5 presents a famous alleged proof that all 
triangles are isoceles; the proof appears to be correct but it is not because the proof 
is based on an incorrect diagram. 


1.1 Construction with a Straightedge and Compass 


Construction with a straightedge and compass used to be the fundamental concept 
taught in Euclidean geometry. Recently, it has fallen out of favor in school curricula. It 
is certainly true that the topic has little, if any, practical use. As we show in Sects. 2.2, 
2.3, 2.4, 3.4, the Greeks knew how to perform constructions that are impossible with 
a straightedge and compass by using tools only slightly more advanced. Today, using 
numerical methods, computers can perform constructions to any desired precision. 
Nevertheless, I believe that there are advantages to studying constructions: 


e Itis more fun and more challenging to learn geometry through constructions than 
simply to read theorems and proofs. 

e Significant breakthroughs in mathematics have been achieved by attempts to find 
constructions. Chapter 16 presents a construction by Gauss that led to modern 
abstract algebra, in particular, the theory developed by Evariste Galois. 

e Itis somewhat counterintuitive and therefore very interesting that it can be proved 
that it is impossible to construct some geometric objects. 

e Sadly, there a many people who waste years of their lives trying to perform 
impossible constructions. Students should certainly be aware of the futility of 
such efforts. 


1.2 Fixed Compasses and Collapsing Compasses 


Some geometry textbooks present the construction of a perpendicular bisector of a 
line segment by constructing two circles centered at the ends of the line segment such 
that the radii are equal and greater than half the length of the segment (Fig. 1.2a). 
This can only be done with a fixed compass because after drawing the circle centered 
at A, the distance between the legs of the compass needs to remain fixed to draw the 
circle centered at B. 


1.3 Euclid’s Construction for Copying a Line Segment 3 


C 
C 
A B 
A B 
D 
D 
Fig. 1.2a Construction of a perpendicular bi- Fig. 1.2b Construction of a perpendicular bi- 
sector with a fixed compass sector with a fixed or a collapsing compass 


Figure 1.2b shows the construction of a perpendicular bisector with either a fixed 
or a collapsing compass. Two circles are constructed: one centered at A with radius 
AB andone centered at B with radius BA. This can be done with a collapsing compass 
because (obviously) AB = BA, so the compass does not have to “remember” the 
length of AB to construct a circle centered at B with the same radius. The proof 
that the line constructed shown in Fig. 1.2a is a perpendicular bisector is not at all 
elementary because relatively advanced concepts like congruent triangles have to 
be used. However, the proof that the construction of a perpendicular bisector shown 
in Fig. 1.2b is correct is simple and based on the fact that AABC is an equilateral 
triangle. In fact, this is the first proposition in Euclid’s Elements. AC = AB since they 
are radii of the same circle and, similarly, BC = BA. We have: AC = AB = BA = BC. 

Figure 1.3a shows that for the construction with a fixed compass, the triangle will 


be an isosceles, not necessarily an equilateral, triangle (Fig. 1.3b). 


1.3 Euclid’s Construction for Copying a Line Segment 


The second proposition of Euclid’s Elements describes how to copy a given line 
segment AB to a segment of the same length, one of whose end points is a given 
point C. Therefore, a fixed compass adds no additional capabilities and a collapsing 
compass is sufficient, although constructions are easier with a fixed compass. 


Theorem 1.1 Given a line segment AB and a point C, a line segment CC’, one of 


whose endpoints is C, can be constructed using a collapsing compass, such that 
AB = CC’ (Fig. 1.4a). 
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C 
C 
A B 
A B 
D D 
Fig. 1.3a Construction of an isoceles triangle Fig. 1.3b Construction of an equilateral tri- 
with a fixed compass angle with a collapsing compass 


Proof Construct the line segment AC. Construct the equilateral triangle AACD 
whose base is AC (Fig. 1.4b). By Euclid’s first proposition, the triangle can be 
constructed using a collapsing compass. Construct the ray that is an extension of 
the line segment from D to A, and construct the ray that is an extension of the line 
segment from D to C (Fig. 1.5a). Construct the circle centered at A with radius AB 
and denote the intersection of the circle and the ray extending DA by E (Fig. 1.5b). 
Construct the circle centered at D with radius DE and denote the intersection of the 
circle and the ray extending DC by F (Fig. 1.6). 

DC = DA because AACD is equilateral. AE = AB are radii of the same circle, 
as are DF = DE. Therefore: 


CF = DF - DC = DE- DC = DE- DA=AE=AB. go 


The specification of the directions of the rays is essential. The proof here works 
for any line segment AB and any point C, regardless of its position relative to AB. By 
specifying directions the “cone” enclosed by the two rays will intersect the circles 
even if AC > AB (Fig. 1.7). 


C A 


Fig. 1.4a Copy the line segment AB. The Fig. 1.4b Copying a line segment with a col- 
orientation of CC” is not important. lapsing compass 
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D B B 


aS 


E 


Fig. 1.5a Constructing rays from D Fig. 1.5b Constructing a circle with radius 
AB 


1.4 A Flawed Construction for Copying a Line Segment 


Proof Construct three circles: one centered at A with radius AB, one centered 
at A with radius AC, and one centered at C with radius AC = CA. Denote the 
intersections of the circles centered at A and C by E and F, respectively, and denote 
an intersection of the circle centered at C and the circle centered at A with radius 
AB by D. If AC > AB, the construction is as shown in Fig. 1.8. 

Construct a circle centered at E with radius ED. Denote the intersection of this 
circle with the circle centered at A with radius AC by G. There are two intersections, 
so choose the one closer to C (Fig. 1.9). CD = CE are radii of the same circle as are 
AE = AG. By construction the radii CE and AE are equal. Therefore, 


CD =CE=AE=AG. 


EG = ED are radii of the same circle, so AEAG =% ADCE by side-side-side and 
LGEA = ZDEC. 


Fig. 1.6 Construction of CF = AB 


6 1 The Collapsing Compass 


Fig. 1.7 Construction for AC > AB 


Since: 
¿GEC = LGEA-ZCEA = ZDEC-/CEA = ZDEA, 


it follows that AADE = ACGE by side-angle-side. AB = AD are radii of the smaller 
circle centered at A, so GC = AD = AB. o 


The proof is correct only if AC > AB. Figure 1.10 shows a diagram where 
AC < AB and you can see that AB + GC. 


F 


Fig. 1.8 Construction for copying a line segment (1) 
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Fig. 1.9 Construction for copying a line segment (2) 


1.5 Don’t Trust a Diagram 


Theorem 1.2 (Incorrect, of course) All triangles are isosceles. 


Proof (Incorrect) Given an arbitrary triangle AABC, let P be the intersection of the 
angle bisector of ZBAC and the perpendicular bisector of BC. The intersections of 
the altitudes from P to the sides AB, AC are denoted by E, F, respectively (Fig. 1.11). 
AAPE = AAPF because they are right triangles with equal angles œ and common 
side AP. ADPB = ADPC since they are right triangles, PD is a common side and 
BD = CD =a. AEPB = AFPC since they are right triangles, EP = PF by the first 
congruence and PB = PC by the second congruence. By combining the equations 
we get that AABC is isoceles: 


AB = AE + EB = AF + FC = AC. Gi 


The logic of the proof is correct, but the diagram upon which the proof is based 
is not correct because point P is outside the triangle (Fig. 1.12). 


Le 


y | 


Fig. 1.10 A diagram for which the proof doesn’t work 


= 
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a 


D a 


Fig. 1.11 An incorrect proof that all triangles are isoceles 


What Is the Surprise? 


As a student I took it for granted that a compass has a friction joint that maintains 
the distance between the point and the pencil when it is lifted from the paper. When 
the teacher used a compass made from a piece of string and a piece of chalk, I never 
imagined that it differed from my compass. The article by Gotfried Toussaint was a 
real surprise, as was his demonstration that post-Euclid proofs were incorrect because 
they depended on diagrams that made unwarranted assumptions. I recommend the 
article to readers who wish to deepen their understanding of proofs in mathematics. 


Fig. 1.12 Why the construction doesn’t work 
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Sources 


This chapter is based on [50]. The incorrect construction of the equivalence of 
the two compasses in Sect. 1.4 is from [37]. A comprehensive English translation 
of Euclid’s Elements together with an extensive commentary [22] was written by 
Thomas L. Heath, one of the foremost experts in Greek mathematics. 
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Chapter 2 ordets 
Trisection of an Angle 


It is impossible to trisect an arbitrary angle (divide the angle into three equal parts) 
using only a straightedge and compass. Trisection requires the construction of cube 
roots, but a straightedge and compass can only construct lengths that are expressions 
built from integers, the four arithmetic operators and square roots. This was proved 
by Pierre Wantzel in 1837. Nevertheless, innumerable amateurs continue to attempt 
to trisect an angle. Their constructions are approximations though they are convinced 
that the constructions are correct. Section 2.1 presents two such constructions, de- 
velops formulas for the angles and shows the errors in the approximations. 

Greek mathematicians discovered that if other instruments are allowed, angles 
can be trisected. Section 2.2 explains a construction by Archimedes using a simple 
instrument called a neusis and Sect 2.3 shows how to double a cube using the neusis. 
Section 2.4 presents a construction for trisection by Hippias using an instrument 
called a quadratrix. The rest of the chapter contains a proof of the impossibility of 
trisecting an angle. Section 2.5 characterizes constructible numbers, Sect. 2.6 relates 
constructible numbers to roots of polynomials and Sect. 2.7 uses this theory to show 
that trisecting an angle and doubling a cube are impossible. 


2.1 Approximate Trisections 


2.1.1 First Approximate Trisection 


Construction: Let 6 = ZAOB be an arbitrary angle and without loss of generality 
assume that A, B are on a unit circle whose center is O. Bisect ZAOB and let C be 
the intersection of the bisector with the unit circle. Let D be the midpoint of OA and 
let T be the midpoint of the DC. Denote the angle ZDOT by ¢ (Fig. 2.1). 


© The Author(s) 2022 11 
M. Ben-Ari, Mathematical Surprises, https://doi.org/10.1007/978-3-03 1-13566-8 2 


12 2 Trisection of an Angle 


Fig. 2.1 First approximate trisection (1) 


Theorem 2.1 


2 sin(6/2) 
1+2cos(@/2) ` 


tan @ = 


Proof Figure 2.2 is extracted from Fig. 2.1 and contains additional annotations. 
Let CF be the perpendicular to OA that intersects OA at F. Since OC = 1, 
CF = sin(6/2) and OF = cos(@/2). Let TE be the perpendicular to OA that 
intersects OA at E. 
T is the midpoint of DC so DT = TC = a. But FT is the median to the hypotenuse 
of a right triangle, so FT = a and therefore ADT F is isoceles. It follows that TE is 
the both the median and the altitude of DF. From the diagram it is easy to see that: 


Compute the length 2a = CD using Pythagoras’s Theorem in ADCF: 


@ 1\7 6 
ae vt 2U 
(2a)~ = [cos 5 5) +sin 5° 
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sin(@/2) 


< cos( 0/2) > 
< (cos(@/2)-(1/2)) > 


<— (1/2) + (1/2) (cos(@/2) — (1/2)) —> 


Fig. 2.2 First approximate trisection (2) 


The length h = TE can be computed from Pythagoras’s Theorem in ADTE: 


h 
tan ġ = — = ———<—— = ——. 
0 
OE ee eae 1+2cos = 
2 2 2 2 2 


This is an approximation to a trisection ¢ = 6/3. For 6 = 60°: 


4 | 2 sin 30° 
tan 


-1 o o 
ear ee Í = 20.1° x 20°. 
1 +2 cos z) ee 0 j 


Table 2.1 shows the errors for a range of acute angles. The error is relatively small 
for small angles, rising to 1% at 85°. 
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Table 2.1 Errors in the first approximate trisection 


a(°) 0/3(°) tan™! (°) Error(°) Error(%) 


5 1.667 1.667 0.000 0.004 
10 3.333 3.334 0.000 0.014 
15 5.000 5.002 0.002 0.032 
20 6.667 6.670 0.004 0.057 
25 8.333 8.341 0.007 0.088 
30 10.000 10.013 0.013 0.128 
35 11.667 11.687 0.020 0.174 
40 13.333 13.364 0.030 0.228 
45 15.000 15.043 0.043 0.289 
50 16.667 16.726 0.060 0.358 
55 18.333 18.413 0.080 0.435 
60 20.000 20.104 0.104 0.520 
65 21.667 21.799 0.133 0.612 
70 23.333 23.500 0.166 0.713 
75 25.000 25.206 0.206 0.823 
80 26.667 26.918 0.251 0.941 
85 28.333 28.636 0.303 1.068 


2.1.2 Second Approximate Trisection 


Construction: Let 6 = ZAOB be an arbitrary angle and without loss of generality 
assume that A, B are on a unit circle whose center is O. Construct a circle of radius 
1/3 with center O and let D be its intersection with OA. Bisect ZAOB and let C be 
the intersection of the bisector with the circle of radius 1/3. Construct the chord CD 
and the chords AE = ET = CD. Since equal chords subtend equal central angles 
ZTOE = ZEOA = ¢ (Fig. 2.3). 


Theorem 2.2 
1 2.9 
cos¢@=1- g0 —cos(@/2)) = 1 — 9 sin (0/4). 
Proof By the Law of Cosines in ADOC: 


1 1 1 


2 2 2 1 2 
CD = (5] + (5) = (5] (5) cos(@/2) = rie — cos(6/2)). 


By the Law of Cosines in AEOA: 


AE = 17 +17-2-1-1-cos¢ =2(1 —cos¢). 
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Fig. 2.3 Second approximate trisection 


Equating the two expressions for CD = AE and simplifying we get: 
1 
cosġ = 1- z0 — cos(0/2)). 


Since cos 2e = cos? a — sin? a = 1 — 2 sin? a, and therefore 1 — cos 2a = 2 sin? Q, 


we have the alternate formula: 
2.3 
cos ġ = 1- 9 sin~(6/4) . 
This is an approximation to a trisection 2¢ = 6/3. For 6 = 60°: 
-1 1 o o o 
2 cos Pall -cos 30°) zx 19.8° = 20°. 


Table 2.2 shows the errors for a range of acute angles. This construction is much 
less accurate than the one in Sect. 2.1.1. 
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Table 2.2 Errors in the second approximate trisection 
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oC) -A/3(°) 


5 1.667 
10 3.333 
15 5.000 
20 6.667 
25 8.333 
30 10.000 
35 11.667 
40 13.333 
45 15.000 
50 16.667 
55 18.333 
60 20.000 
65 21.667 
70 23.333 
75 25.000 
80 26.667 
85 28.333 


cos™! 2 (°) 


1.667 
3.332 
4.997 
6.659 
8.319 
9.975 
11.626 
13.273 
14.914 
16.549 
18.177 
19.797 
21.408 
23.011 
24.603 
26.185 
27.756 


Error (°) 


0.000 
0.001 
0.003 
0.008 
0.015 
0.025 
0.040 
0.060 
0.086 
0.118 
0.156 
0.203 
0.258 
0.322 
0.397 
0.481 
0.577 


Error (%) 


0.007 
0.028 
0.063 
0.113 
0.176 
0.254 
0.346 
0.451 
0.571 
0.705 
0.853 
1.015 
1.192 
1.382 
1.586 
1.805 
2.038 


2.2 Trisection Using a Neusis 


The term straightedge is used instead of ruler because a straightedge has no marks 
on it. It can only be used to construct a straight line between two given points. 
Archimedes showed that a neusis, a straightedge with two marks that are a fixed 
distance apart, can be used to trisect an angle (Fig. 2.4). We define the distance 


between the marks to be 1. 


Construction: Let a = ZABE be an arbitrary angle in a unit circle with center B, 
where the radius of the circle equals the distance between the marks on the neusis. 
Extend the radius EB beyond the circle. Place an edge of the neusis on A and move it 
until it intersects the extension of EB at D and the circle at C, using the marks so that 
the length of the line segment CD is 1.! Construct the line AD. Denote CDB = B 


(Fig. 2.5). 


Fig. 2.4 A neusis 


1 This operation is called verging. 
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Fig. 2.5 The neusis construction for trisecting an angle (1) 


Theorem 2.3 8 = a/3. 


Proof Construct BC and denote the angles and line segments as shown in Fig. 2.6. 
AABC and ABCD are isoceles triangles: AB = BC are radii of the same circle and 
BC = CD by construction using the neusis. Since the sum of the angles of a triangle 
is equal to 180° and the sum of supplementary angles is also equal to 180°, we have: 


e = 180° - 28 

y = 180° -€ =28 

ô = 180° — 2y = 180° — 46 

a =180°-6-ß=3ß. o 


Fig. 2.6 The neusis construction for trisecting an angle (2) 
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2.3 Doubling the Cube with a Neusis 


Given a cube C construct another cube with twice its volume. If the volume of 
C is V its sides are of length WV. The sides of a cube with twice the volume are 
W2V = V2- VV. , SO if we can construct V2 we can double the cube. 

Construction: Construct the unit equilateral triangle AABC and extend CA with 
another unit line segment to D. Construct rays extending AB and DB. Place the 
neusis on point C and move it until one mark on the neusis is placed on the ray AB 
at P and the other mark is placed on the ray DB at Q. Denote CQ = x and BP = y 
(Fig. 2.7). 


Theorem 2.4 x = V2. 


Proof Since AABC is equilateral, cos ZCAP = cos 60° = 5 and by the Law of 
Cosines in AAPC: 


CP = AC +AP —2-AC- APcos60° (2.1a) 

1 
GIs a ly 2 eGeles (2.1b) 
x 4+2x =y +y. (2.10) 


By Menelaus’s theorem (Thm. A.20): 


Fig. 2.7 Doubing the cube with a neusis 
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Therefore: 
1 1 2 
ANST =] (2.2a) 
yx 1 
xy=2. (2.2b) 


Substituting Eq. 2.2b into Eq. 2.1c gives: 


x? 42x = ae 
x2 x 

xÍ 42x? =4+2x 

x3(x +2) =2(x +2) 


x= V2. o 


2.4 Trisection Using a Quadratrix 


Let ABCD be a square. Let l; be a line segment placed initially at DC and let l2 
be a line segment placed initially at AD. Move l; move at a constant linear velocity 
until it reaches AB and rotate / at a constant angular velocity clockwise on A until 
it also reaches AB. Assume that they reach AB together. For example, if lz rotates at 
1°/second and the side of the square is 9 centimeters, /; must move at 0.1 cm/second. 
The trace of their point of intersection P is called a quadratrix curve or simply a 
quadratrix (Fig. 2.8a). Its definition is attributed to the mathematician Hippias. 


A B A B 


Fig. 2.8a A quadratrix curve Fig. 2.8b A quadratrix compass 
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A quadratrix can be constructed using a quadratrix compass as shown in Fig. 2.8b. 
It consists of two (unmarked) straightedges that move as described above. A joint 
constrains them to move together and traces out the curve. 

A quadratrix can be used to trisect an angle. 
Construction: Let ZCDP, = a be an arbitrary angle, where P4 is the intersection 
of the line defining the angle æ relative to DC and the quadratrix. Construct a line 
through P; parallel to DC and denote its intersection with AD by E. Denote the line 
segment DE by t and trisect it (Sect. 2.5) to obtain point F that is t/3 from DC. Let 
P be the intersection of a line from F parallel to DC and the quadratrix, and denote 
by @ the angle between DC and DP) (Fig. 2.9). 


Theorem 2.5 6 = a/3. 


Proof E has y-coordinate 1 — t so by construction F has y-coordinate 1 — (t/3). 
Since the constant linear velocity of the horizontal line is proportional to the constant 
angular velocity of the rotating line 6/a@ = (t/3)/t and 6 = a@/3. o 


Fig. 2.9 Trisection of an angle using a quadratrix 
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2.5 Constructible Numbers 


Let / be a line segment defined to be of length 1. 


Definition 2.1 A number a is constructible if and only if a line segment of length a 
can be constructed with a straightedge and compass starting from /. 


Given line segment / = AB, construct a line containing AB and use the compass 
to find a point C on the line that is a distance of 1 from B. Then AC is of length 2 
so the number 2 is constructible. A line segment BD of length 1 can be constructed 
perpendicular to AB at B. The hypotenuse of the triangle AABD is of length V2 so 
the number V2 is constructible. 


Theorem 2.6 A number is constructible if and only if it is the value of an expression 
built from the integers, the four arithmetic operations {+, —, x, /} and the operation 
of taking a square root y. 


Proof First we show that the values of these expressions are constructible. 


Addition and subtraction: Given line segments PO = a and RS = b, construct a 
circle centered at Q with radius b (Fig. 2.10). Extend PO until it intersects the circle 
at U. Then PTQU is a line segment, where PT =a—band PU =a+b. 


P 
e 


| S 
< a > ig 
ee b 
< a+b 


Fig. 2.10 Construction of addition and subtraction 
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Ze Zo 


b C b'c 
e wo 
O O 
a D A a/b D A 
< ab > < a > 
Fig. 2.11a Construction of multiplication Fig. 2.11b Construction of division 


Multiplication: By similar triangles in Fig. 2.11a, (1/b) = (a/OA), so OA = ab. 
Division: By similar triangles in Fig. 2.11b, (1/b) = (OD/a), so OD = (a/b). 


Square roots: Given a line segment BC = a, construct AB = 1 + a and a semicircle 
with AB as its diameter. Construct a perpendicular at C and let D be the intersection 
of the perpendicular and the circle (Fig. 2.12). ZADB is a right angle because it is 
subtended by a diameter. By similar triangles (h/1) = (a/h), so h? = a and h = Ya. 


To prove the converse of the theorem, we need to determine what expressions can 
be constructed by a straightedge and compass. There are three constructions:? 


1. Two lines intersect in a point (Fig. 2.13a). The coordinates of the intersection can 
be derived from the equations of the two lines y = x and y = 4x — 2. The point of 
intersection is P = (2/3,2/3). 

2. A line intersects a circle in zero, one or two points (Fig. 2.13b). The coordinates of 
the intersections can be derived from the equations of the line y = x and the circle 
x? + y? = 4. The points of intersection are P = (V2, V2) and Q = (-V2, - V2). 


Fig. 2.12 Construction of a square root 


2 For clarity these are illustrated for specific values rather than the most general equations. 
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4 J 4 
3 3 
2 2 
P 
1 1 
-3 -J -1 1 2 3 4 -3 -ġ -ı 1 2 3 4 
-1 |/ Q -1 
-2 -2 
7 -3 
Fig. 2.13a The point of intersection of two Fig. 2.13b The points of intersection of a line 
lines and a circle 


3. Two circles intersect in zero, one or two points (Fig. 2.14). The coordinates of the 
intersections can be derived from the equations of the two circles (x—1)?+y? = 4, 
(x + 1)? + y? =4. The points of intersection are P = (0, V2), Q = (0,-V2). o 


Fig. 2.14 The points of intersection of two circles 
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2.6 Constructible Numbers As Roots of Polynomials 


To show that a number is not constructible, we need to prove that it cannot be 
expressed using just integers and the operations {+, —, x, /, y}. 

We will show that constructible numbers are the roots of a certain class of 
polynomials and then prove that trisecting an angle and doubling a cube require the 
construction of roots of polynomials that are not in this class. Today these results 
are proved using field theory from abstract algebra, but here I give a proof that uses 
elementary mathematics. The proof is based on the following definition. 


Definition 2.2 The depth of an expression built from the integers and the operators 


{+, —, x, /, y} is the maximum level of nesting of square roots. 


Example 2.1 Consider the following expression: 


174 3V17 - 434- 2V17 - 2434 + 2VI7. 


The depth is 3 because at the right of the expression we have V17 which is nested 


within V34 + 2V17, which in turn is nested within V17 +: — +- — 2434 + 2 V17. 


Theorem 2.7 A expression of depth n can be expressed as a + byc where a, b, c are 


expressions of depth at most n — 1. 


Proof Simple computations show that the expressions (a, + b Vc) op (a2 + bayc) 
for the operators op = {+, —, x} result in expressions a + byc of depth n — 1. For 
division the computation is a bit more complicated: 


ai + biye z (a, + bi yc) (az — bay c2) 
ay+boVe (ax + byc) (az - bac) 


aiaz — bbc a2bı — aıb2 


= + ve, 


2_ 72 2_ 72 
as b5c as b5c 


which is of the form a + byc and of depth n — 1. Finally, the square root of an 
expression of depth n — 1 is an expression of depth n. o 


Theorem 2.8 Let p(x) be a monic cubic polynomial with rational coefficients: 
Su 2 
p(x) =x" +ax" +a\x+a0, 


and let r = a + byc be a root of p(x) of minimal depth n, where a, b, c are of depth 
(at most) n — 1. Then r’ = a — byc is a root of p(x) andr + r'. 
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Proof Let us compute p(r) which is equal to 0 since r is a root: 
(a+ byc)? + ax(a + byc)? + a; (a + byc) + ao 
(a? + 3a? byc + 3ab?c + becyc) 
+ay(a* + 2abyc + b?c) + ai (a + byc) + ag 


(a> + 3ab*c + ana” + a2b?c + aja + ao) 
+ (3a*b + b?’c + 2anab + a,b) Ve = 
d+evc =0. 


where d, e are expressions of depth n — 1 formed from the rational coefficients and 
a,b,c. Then yc = —d/e, so a + byc can be expressed as an expression of depth 
n — 1, contracting the assumption that a + byc is of minimal depth n. Since yc + 0 
and is of depth n, for d + ec to be zero it must be that d = e = 0. 

Consider now r’ = a — byc. By examining the above computation we see that 
p(r’) =d-evc =0+0- ve =0, sor’ is also a root of p. 

Ifr =r’ then 0 =r — r’ = 2byc, which is true only if b = 0 so r,r’ would be of 
depth n — 1, again contradicting the assumption. o 


Theorem 2.9 [fa monic cubic polynomial with rational coefficients: 
ee) 2 
p(x) =x" +a2x^ + a,x + ao 
has no rational roots then none of its roots is constructible. 


Proof By the Fundamental Theorem of Algebra (Thm. 16.1) p(x) has three roots 
r1, r2,r3. Let rı = a + byc be a root of minimal depth n. By the assumption that 
there are no rational roots, n > 1, and therefore b + 0 and c + 0. By Thm. 2.8, 
r2 =a — byc is also a root. Perform the following multiplication: 


(x = ri) (x = r2)(x = r3) = x? — (r1 + r2 + r3)x? (2.3a) 
+ (r1r2 + r1r3 +r2r3)X + r1r2r3 (2.3b) 

az = —(r1 +r2 + r3) (2.3c) 

r3 = —-(a2 +r, +r2). (2.3d) 


Since az is rational so is: 
r3 = —d2 — (rı +2) = —a2 — 2a, 


contradicting the assumption. o 


26 2 Trisection of an Angle 


2.7 Impossibility of the Classical Constructions 


Theorem 2.10 V2 is irrational. 


Proof Assume that V2 is rational and equal to p/q where p, q are integers with no 
common factors other than +1. Then: 


(p/q)° = (V2)° 
p> = 293, 
so p must be divisible by 2, say p = 2r. Now: 
87° = 2q° 
g@=4r, 
so q is divisible by 2, contradicting the assumption that p, q have no common factor.O 


Theorem 2.11 x° — 2 has no rational roots so it is impossible to double a cube with 
a straightedge and compass. 


Proof One of its roots is V2 which by Thm. 2.10 is irrational. The other roots are 
the roots of the quadratic equation x? + V2x + (V2)? obtained by dividing x? — 2 by 
x — V2. It is easy to check that its roots are not rational (in fact, not even real). O 


Theorem 2.12 Jt is impossible to trisect an arbitrary angle with a straightedge and 


compass. 


Proof It is sufficient to show the impossibility for one angle. Let us try to trisect 60° 
to obtain 20°. By Thm. A.6: 


cos 3a@ = 4 cos? a — 3 cos œ 
cos 60° = 4 cos? 20° — 3 cos 20° . 


Denote x = cos 20° and 2x by y. Since cos 60° = 1/2 we have: 


E EG 
ie 

8x3 - 6x -1=0 

y -3y-1=0. 


To prove that the polynomial y3 —3y—1 has no rational roots suppose that y = a/b 
is a rational root with a, b having no common factor other than +1. Then: 
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(a/b — 3(a/b) -1 =0 (2.4a) 
a> —3ab* = b? (2.4b) 

a(a — 3b’) = b’? (2.4c) 

a? = b(b* + 3ab) . (2.4d) 


By Eq. 2.4c, b must be divisible by a, and by Eq. 2.4d, a must be divisible by b, 
which is possible only if a = b = +1 and a/b = +1. By computation, y = a/b = 1 
and y = a/b = —1 are not roots of the polynomial. o 


An alternate way of proving the impossibility of the constructions is to use the 
following theorem which we present without proof. 


1 


Theorem 2.13 [fa monic polynomial p(x) = x" + an-1x"7 +++++ ao with integer 


coefficients has rational roots then it has integer roots. 
To show the impossibility of duplicating a cube we need to show that: 
3 
xX — 2 = (x = r2)(x - rı)(x — ro) 


has no integer roots. Since rorır2 = —2, all roots must divide 2, so the only possible 

integer roots are +1, +2. A quick computation shows that none of them are roots. 
To show the impossibility of trisecting an angle we need to show that y? — 3y — 1 

has no integer roots. An integer root must divide —1 but neither 1 nor —1 are roots. 


What Is the Surprise? 


Underwood Dudley has made an extensive study of what he calls “cranks” who 
waste years of their lives trying to trisect angles with a straightedge and compass. 
Not only do they delude themselves into thinking that this is possible, but, even 
worse, they think that a solution would be important. Of course, a solution would 
have no practical use, since tools such as the neusis and quadratrix can solve the 
problem exactly. The sheer number of such constructions is surprising, especially 
since many of them are clever and achieve good approximations. Computing the 
formulas associated with the constructions is an excellent exercise in trigonometry. 

It is also surprising that proofs of the impossibility of these geometric construc- 
tions are purely algebraic using properties of roots of polynomials. 
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Sources 


Wikipedia [51, 58, 62] is a good source for the constructions in this chapter. The 
two approximate trisections are from [15, pp. 67—68, 95-96]. The second example 
is attributed to the famous philosopher Thomas Hobbes. Both [31, pp. 48—49] and 
[15, pp. 6-7] discuss trisection using the quadratrix. The doubling of the cube using 
a neusis is taken from [14]. 

A rigorous treatment of constructibility can be found in textbooks on abstract 
algebra such as [17], which contains a general proof of the converse of Thm 2.6 in 
Sect. 32. Theorem 2.13 is Thm. 23.11 of [17]. A relatively accessible presentation 
of Wantzel’s proof can be found in [48]. My presentation of constructibility is based 
upon the presentations in [11, Chap. III] and [27]. 
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Chapter 3 si 
Squaring the Circle 


Squaring the circle, the construction of a square with the same area as a given circle, 
is one of the three construction problems that the Greeks posed but were unable to 
solve. Unlike trisecting the angle and doubling the cube, where the impossibility 
follows from properties of the roots of polynomials, the impossibility of squaring 
the circle follows from the transcendentality of z: it is not the root of any polynomial 
with rational coefficients. This is a difficult theorem that was proved in 1882 by Carl 
von Lindemann. 

Approximations to m ~ 3.14159265359 have been known since ancient times. 
Some simple but reasonably accurate approximations are: 


22 333 355 
7“ 3.142857, Toe ~ 3.141509, iB“ 3.141593. 


We present three constructions by a straightedge and compass of approximations 
to x. One is by by Adam Kochański (Sect. 3.1) and two are by Ramanujan (Sects. 3.2, 
3.3). Section 3.4 how to square the circle using the quadratrix. 

The following table shows the formulas for the lengths that are constructed, their 
approximate values, the difference between these values and the value of x and the 
error in meters that results if the approximation is used to compute the circumference 
of the earth given that its radius is 6378 km. 


Construction Formula Value Difference Error (m) 
m — 3.14159265359 - — 
Pe 40 5 
Kochański z7 2V3 3.14153333871 5.93 x 10 757 
i 355 7 
Ramanujan 1 TB 3.14159292035 2.67 x 10 3.4 
192\ "4 
Ramanujan 2 (> + = 3.14159265258 1.01 x 107° 0.013 
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3.1 Kochanski’s Construction 


Construction (Fig. 3.1): 


1. Construct a unit circle centered at O, let AB be a diameter and construct a tangent 
to the circle at A. 

2. Construct a unit circle centered at A and denote its intersection with the first circle 
by C. Construct a unit circle centered at C and denote its intersection with the 
second circle is D. 

3. Construct OD and denote its intersection with the tangent by E. 

4. From E construct F, G, H, each at distance 1 from the previous point. 

5. Construct BH. 


Fig. 3.1 Kochański’s approximation to 7 


Theorem 3.1 BH = Vs -2V3 x x. 


Proof Figure 3.2 is an enlarged extract from Fig. 3.1, where dashed line segments 
have been added. Since all the circles are unit circles the lengths of the dashed lines 
are 1. It follows that AOCD is a rhombus so its diagonals are perpendicular to and 
bisect each other at the point labeled K. AK = 1/2. 


3.1 Kochaniski’s Construction 31 


Fig. 3.2 Detail from Kochariski’s construction 


The diagonal AC forms two equilateral triangles AOAC, ADAC so ZOAC = 60°. 
Since the tangent forms a right angle with the radius OA, ZKAE = 30°. Now: 


1/2 
1/2 _ oog39° = X2 
EA i 2 
Ae 

v3 


aH =3-EA=(3- d). 


AABH is aright triangle and AH = 3 — EA, so by Pythagoras’s Theorem: 


BH = AB +AA 
27 - 1 4 
+ 6V3 + = 0 ays 


=4 = 
3 3 


BH = Vs — 2V3 x 3.141533387 x 7. 
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3.2 Ramanujan’s First Construction 


Construction (Fig. 3.3): 


OMAN NMN PWN Fe 


— m 
pi o 


. Construct a unit circle centered at O and let PR be a diameter. 
. Construct the point H that bisects PO and the point T that trisects RO (Thm. 2.6). 
. Construct the perpendicular at T that intersects the circle at Q. 
. Construct the chords RS = QT and PS. 

. Construct a line parallel to RS from T that intersects PS at N. 
. Construct a line parallel to RS from O that intersects PS at M. 
. Construct the chord PK = PM. 

. Construct the tangent at P of length PL = MN. 

. Connect the points K, L, R. 

. Find point C such that RC is equal to RH. 

. Construct CD parallel to KL that intersects LR at D. 


Fig. 3.3 Ramanujan’s first construction 
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Theorem 3.2 RD = aces xn 


Proof RS = QT by construction and by Pythagoras’s Theorem for AQOT: 
= eed 2\7 V5 
RS = QT =4|1?- |=] ==. 
Npe = 


ZPSR is subtended by a diameter so APSR is a right triangle. By Pythagoras’s 
theorem: 


By construction MO || RS so AMPO ~ ASPR and: 


PM PS 
PO PR 
PM _ V31/3 
1 2 
aes I 
Pu =~ 


By construction NT || RS so ANPT ~ ASPR and: 


PN PS 

PT PR 

PN _ V31/3 

53 2 

pyp =- N31 
18 

EPE a EER 5 1\ {3I 

MN = PN - PM = V31(|— - -|= —_. 
y a| 5] 9 


APKR is aright triangle because ZPKR is subtended by a diameter. By construction 
PK = PM and by Pythagoras’s Theorem: 


ALPRis aright triangle because PL is a tangent so ZLPR is aright angle. PL = MN 
by construction and by Pythagoras’s Theorem: 
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2 
a-f] 2. 


By construction RC = RH = 3/2 and CD || LK. By similar triangles: 


RD RL 
RC RK 
RD _ V355/9 
3/2 113/6 
TEA 
113 
—2 355 


RD = — z 3.14159292 ZT. 
113 3 59292035 x m 


In Fig. 3.4 the line segments are labeled with their lengths. o 


PS=V31/3 
v31/9 


RD=¥V355/113 


v31/9 PR=V31/6 


RK=VI113/6 


Fig. 3.4 Ramanujan’s first construction with labeled line segments 


3.3 Ramanujan’s Second Construction 35 


3.3 Ramanujan’s Second Construction 


Construction (Fig. 3.5): 


1. 


NNW ND 


on 


Construct a unit circle centered at O with diameter AB and let C be the intersection 
of the perpendicular to AB at O with the circle. 


. Trisect the line segment AO such that AT = 1/3 and TO = 2/3 (Thm. 2.6). 

. Construct BC and find points M, N such that CM = MN = AT = 1/3. 

. Construct AM, AN and let P be the point on AN such that AP = AM. 

. From P construct a line parallel to MN that intersects AM at Q. 

. Construct OQ and then from T construct a line parallel to OQ that intersects AM 


at R. 


. Construct AS tangent to A such that AS = AR. 
. Construct SO. 


Fig. 3.5 Ramanujan’s second construction 
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192 1/4 

Theorem 3.3 3VSO = [> + 3 mT, 

Proof ACOB is a right triangle so by Pythagoras’s Theorem CB = V2 and: 


NB = V2 - 2/3. 


ACOB is isoceles so ZNBA = ZMBA = 45°. By the Law of Cosines: 


AN =BA +BN —2-BA-BN-cosZNBA 
2\7 OV. V 2 
EEEE 


2 9 
— |22 
AN =4|—. 
9 
Again by the Law of Cosines: 


AM =BA +BM —2-BA-BM-cos/MBA 


-2+(va-3) -2-2.(v2-3)- 2-8 


3 2 9 
— [19 
AM =,/—. 
9 


By construction QP || MN so AMAN ~ AQAP, and by construction AP = AM: 


AQ AP AM 
AM AN AN 

AM 19/9 19 
AN 22/9 3V2. 


> 


AQ = 


By construction TR || OQ so ARAT ~ AQAO and: 


IR AT 
AQ AO 
a a E 


AO 3V22 1 9y22° 
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By construction AS = AR and AOAS is a right triangle because AS is a tangent. 
By Pythagoras’s Theorem: 


(oa) 


= 192 1/4 192 1/4 
3VSO =3 (1 a z) = (> + =| x 3,14159265258 x 7. 


In Fig. 3.6 the line segments are labeled with their lengths. o 


AM=19/9 


AQ=19/3V22 


Fig. 3.6 Ramanujan’s second construction with labeled line segments 
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3.4 Squaring a Circle Using a Quadratrix 


The quadratrix is described in Sect. 2.4. 

Let t = DE be the distance that the horizontal straightedge has moved down the 
y-axis, and let 0 be the corresponding angle between the rotating straightedge and 
the x-axis. Let P the position of the joint of the two straightedges. The locus of P is 
the quadratrix curve. 

Let F be the projection of P onto the x-axis and let G be the position of the 
joint when both straightedges reach the x-axis, that is, G is the intersection of the 
quadratrix curve and the x-axis (Fig. 3.7). 


D C 
t 
P 
E 
l-t y 
0 
A x F G B 


Fig. 3.7 Squaring the circle with a quadratrix 


Theorem 3.4 AG =2/z. 


Proof Let y = PF = EA = 1-1. Since ona quadratrix @ decreases at the same rate 
that ¢ increases: 


l1-t_ 90 

1 x/2 

T 
0=-(1-t). 
se) 


Let x = AF = EP. Then tan 6 = y/x so: 


= = yeotd = yeot (1-1) = yeot Sy. (3.1) 
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We usually express a function as y = f(x) but it can also be expressed as x = f(y). 

To obtain x = AG we can’t simply plug y = 0 into Eq. 3.1, because cot 0 is not 
defined, so let us compute the limit of x as y goes to 0. First perform the substitution 
z = (7 /2)y to obtain: 


T 


2 


m 2 (n 2 
x=ycot=y= ( ycot y) = —(zcotz), 
2 m \2 T 


and then take the limit: 


] 2ta 2 . (zcosz 2... COS Z 2cos0 2 
lim x = — lim (z cotz) = — lim = = = 


é = — lim | — , 
230 T z0 mz30\ sinz m z=0 \ (sin z)/z nx 1 T 
where we have used lim,_,9(sin z/z) = 1 (Thm. A.12). o 

What Is the Surprise? 


It is surprising that such accurate approximations to 7 can be constructed. Of course 
one can’t help but be astonished by Ramanujan’s clever constructions. 


Sources 


Kochanski’s construction appears in [7]. Ramanujan’s constructions are from [38, 
39]. Squaring the circle using the quadratrix is from [31, pp. 48-49] and [62]. 
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Check for 


Chapter 4 uodas) 
The Five-Color Theorem 


Maps use colors to distinguish one region from another by ensuring that adjacent 
regions are colored with different colors. In 1852 Francis Guthrie noticed that a 
map of the counties of England could be colored using only four countries. The 
claim that four countries suffice to color any planar map is called the four-color 
theorem and was only proved in 1976 by Kenneth Appel and Wolfgang Haken. They 
used sophisticated mathematical arguments to show that if there is a counterexample 
(a map needing more than four colors), it had to be associated with one of 1834 
configurations. They then used a computer to check these configurations. 

While the four-color theorem is extremely difficult to prove, the proofs of the five- 
and six-color theorems are relatively simple (Sects. 4.5, 4.6). On the way to proving 
these theorems, we define planar maps and graphs (Sect. 4.1), prove Euler’s formula 
(Sect. 4.2) and show that a planar graph must have vertex whose degree is less than 
or equal to five. In Sect. 4.3 Euler’s formula is used to show that two graphs are not 
planar. 

In 1879 Alfred B. Kempe published a proof of the four-color theorem, but in 
1890 Percy J. Heawood showed that the proof is incorrect. In Sect. 4.7 we present 
Kempe’s flawed proof and Heawood’s demonstration that it is not correct. 


4.1 Planar Maps and Graphs 


Definition 4.1 A planar map is a set of regions in the plane separated by boundaries. 
A coloring of a map is an assignment of a color to each region such that regions 
sharing a boundary are assigned different colors. 


Figure 4.1a shows a five-coloring of a planar map with ten regions. Figure 4.1b 
shows a four-coloring of the same map. 
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Fig. 4.la Five-coloring of a planar map Fig. 4.1b Four-coloring of a planar map 


Definition 4.2 A graph is a set of vertices V and a set of edges E, such that each 
edge is incident with exactly two vertices. 

A planar graph is a graph such that no edges cross each other. In a planar graph, 
areas enclosed by a set of edges are called faces. 

A coloring of a planar graph is an assignment of colors to vertices such that no 
two vertices of the same color are connected by an edge. 


Planar maps and planar graphs are dual and it is convenient to investigate coloring 
problems in graphs rather than maps. 


Theorem 4.1 Given a planar map, a planar graph can be constructed such that for 
each coloring of the regions of the map there is a coloring of the vertices of the 
graph, and conversely. 


Proof Construct one vertex for each region and construct an edge between two 
vertices if and only if the corresponding regions share a boundary. oO 


Example 4.1 Figure 4.2a shows the planar map from Fig. 4.1b and the vertices 
associated with the regions. Figure 4.2b shows the planar graph that corresponds to 
the map. 


We can further limit our graphs to those whose faces are triangular. 


Definition 4.3 A graph is triangular if all its faces are bounded by three edges. A 
graph can be triangulated if edges can be added so that the graph is triangular. We 
also say that there is a triangulation of the graph. 


Example 4.2 The faces in the planar graph in Fig. 4.2b are triangular since each one 
is bounded by three edges. The edges are curved so the faces are not triangles, which 
are polygons whose three edges are straight line segments. 
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Da 


Fig. 4.2a Associating vertices with the re- Fig. 4.2b The planar graph that corresponds 
gions of a planar map to the planar map 


Fáry’s Theorem states that any triangular planar graph can be be trans- 
formed into an equivalent planar graph whose edges are straight line seg- 
ments. Therefore, with no loss of generality, proofs can be restricted to 
planar graphs whose faces are triangles. 


Example 4.3 Fig. 4.3 (left) shows that a square can be two-colored, but if it is 
triangulated (center), four colors are necessary. Our goal is to prove that all graphs 
can be n-colored for some n. If the triangulated graph is n-colored, so is the original 
graph, because deleting the extra edges does not invalidate the coloring (right). 


Fig. 4.3 Coloring a triangulated graph 
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4.2 Euler’s Formula 


Theorem 4.2 Let G be a connected planar graph with V vertices, E edges and F 
faces. Then V — E+ F =2. 


Proof By induction on the number of edges. If the number of edges in the graph is 
zero, there is only a single vertex and a single face, so 1 -0+ 1 = 2. Otherwise, there 
is at least one edge e and it connects two vertices v1, v2. Delete edge e. 

Case 1: The graph becomes disconnected (Fig. 4.4a). Merge vı with v2 (Fig. 4.4b). 
The resulting graph G’ is a planar connected graph and has fewer edges than G, so 
by the induction hypothesis (V — 1) — (E — 1) + F = 2 since the number of vertices 
is also reduced by one. Simplifying, we get V — E + F = 2 for G. 


vI v2 V1, V2 


Fig. 4.4a Removing an edge disconnects the Fig. 4.4b Merging two vertices 
graph 


Case 2: The graph remains connected (Fig. 4.5a). G’ has fewer edges than G 
(Fig. 4.5b), so by the induction hypothesis V — (E — 1) + (F — 1) = 2 since removing 


the edge joins two faces into one. Simplifying, we get V — E + F = 2 for G. oO 
e 

Fig. 4.5a Removing an edge does not discon- Fig. 4.5b The graph remains connected and 

nect the graph has fewer edges 


Theorem 4.3 Let G be a connected, triangulated planar graph with E edges and V 
vertices. Then E = 3V — 6. 


Proof Each face is bounded by three edges, so E = 3F'/2, where we divided by 2 
because each edge has been counted twice, once for each face it bounds. By Euler’s 
formula: 
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Fig. 4.6a Fewer edges than the upper limit Fig. 4.6b In a triangulated graph the number 
of edges is maximal 


E=V+F-2 
=V+2E/3-2 
=3V-6. o 


Example 4.4 The planar graph in Fig. 4.2b has 10 vertices and 3 - 10 — 6 = 24 edges. 
Theorem 4.4 Let G be a connected planar graph. Then E < 3V — 6. 


Proof Triangulate G to obtain G’. E’ = 3V’ — 6 by Thm. 4.4. Now remove edges 
from G” to obtain G. The number of vertices does not change so E < 3V — 6. o 


Example 4.5 The graph in Fig. 4.6a has 8 edges and 6 vertices and 8 < 3-6-6 = 12. 
Figure 4.6b shows a triangulated graph with 6 vertices and 3 - 6 — 6 = 12 edges. 


4.3 Non-planar Graphs 


Let us take a short detour to show how Thms. 4.2 and 4.4 can be used to prove that 
certain graphs are not planar. 


Theorem 4.5 Ks, the complete graph on five vertices, is not planar (Fig. 4.7a). 
Proof For Ks, V = 5 and E = 10. By Thm. 4.4 the number of edges must be less 
than or equal to 3 - 5 — 6 = 9 so the graph is not planar. o 
Theorem 4.6 K3 3, the bipartite graph with three vertices on each side, is not planar 
(Fig. 4.8a). 

Proof V = 6 and E = 9. By Thm 4.2 if K3 3 is planar, F = E- V +2 =9-6+2=5. 
But each face is bounded by four edges (Fig. 4.8b), so E = 4F/2 = 10 #9. o 


In 1930 Kazimierz Kuratowski proved a converse to these theorems: if a graph is 
not planar it contains (in a certain sense) Ks or K3,3. 
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ee 


Fig. 4.7a Ks is not planar Fig. 4.7b A failed attempt to draw Ks as pla- 
nar 


4.4 The Degrees of the Vertices 


Definition 4.4 d(v), the degree of vertex v, is the number of edges incident with v. 


Example 4.6 The graph in Fig. 4.2b contains 8 vertices corresponding to the two 
rings and each vertex is of degree 5. The vertex corresponding to the outer face is of 
degree 4 as is the vertex corresponding to the inner face. Therefore: 


X dv) =5-84+4-2=48. 
veV 


To get the total number of edges divide 48 by 2 because each edge was counted 
twice, once for each of the vertices it is incident to. 


OS 


Fig. 4.8a K3 3 is not planar Fig. 4.8b A failed attempt to draw K3,3 as 
planar 
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By generalizing the argument we get: 


Theorem 4.7 Let d; for i in {1,2,3,...,k} be the number of vertices of degree i in 
a connected planar graph G with V vertices and E edges, where k is the highest 
degree of a vertex in V. Then: 


k 


X dQ) = lid = 28. 


veV i=1 


Theorem 4.8 Let G be a connected planar graph with E edges and V vertices, and 
let di for i in {1,2,3,...,k} be the number of vertices of degree i, where k is the 
highest degree of a vertex in V. Then there must be a vertex v in V such that d(v) < 5. 


Proof (1) If there are d; vertices of degree 1, dz vertices of degree 2, . . . , dx vertices 
of degree k, then V = ya di. From Thms. 4.4 and 4.7: 


k 
i- dj =2E < (3V - 6) =6V -12=6 Jdi - 12. 


i=1 i=l 


Therefore: 


Si-d <6 514-12 
i=1 i=l 


k 
X (6-i)d; > 12. 


i=l 
Since 12 > 0 and all d; are positive, for least one 7, 6 — i > O and for that i, i < 6.0 


Proof (2) Let us compute the average degree of the vertices which is the sum of the 
degrees divided by the number of vertices: 


Dki di 


dayg = V 


But the sum of the degrees is twice the number of edges which by Thm. 4.4 gives: 


2E  6V-12 6 
davg = < =6-—<6. 
avg V V V 
If the average is less than six there must be a vertex of degree less than six. o 


Example 4.7 In Fig. 4.2b the sum of the degrees is 8 - 5 + 2 -4 = 48. There are 10 
vertices so the average degree is 48/10 = 4.8 and there must be a vertex of degree 4 
or less. 
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4.5 The Six-Color Theorem 


Theorem 4.9 Any planar graph G can be six-colored. 


Proof By induction on the number of vertices. If G has six vertices or fewer, six 
colors suffice. For the inductive step, by Thm. 4.8 G has a vertex v with degree 5 or 
fewer. Delete vertex v to obtain the graph G’. By the induction hypothesis G’ can 
be six-colored, but v has at most 5 neighbors and at most 5 colors are used to color 
them (Fig. 4.9a), so v can be colored using the sixth color (Fig. 4.9b). oO 


Fig. 4.9a Five colors suffice for coloring the Fig. 4.9b Color v with the sixth color 
neighbors of v 


4.6 The Five-Color Theorem 


Definition 4.5 Let G be a colored planar graph. A (Kempe) chain G’ is a maximal, 
two-colored, connected subgraph of G. 


Theorem 4.10 Any planar graph G can be five-colored. 


Proof By induction on the number of vertices. If G five vertices or fewer, five colors 
suffice. For the inductive step, by Thm. 4.8 G has a vertex v with degree 5 or less. 
Delete v to obtain G’. By the induction hypothesis, G’ can be five-colored. In G, if 
the degree of v is less than 5, or if vj,..., v5, the neighbors of v, are colored with 
four colors or fewer, v can be colored with the fifth color. Otherwise, vj,...,v5 are 
colored with different colors in G’ (Fig. 4.10, top). 


4.6 The Five-Color Theorem 


Fig. 4.10 Proof of the five-color theorem 
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Consider vertex vı which is colored blue and vertex v3 which is colored red. If 
vı, v3 are not connected by a blue-red path (say if the edge vev7 did not exist), we 
can exchange the colors along the path from vı to ve and color v blue. Otherwise, 
consider the blue-red chain which contains v1, v3. By adding v and the edges vv1, vv3 
we obtain a closed path P (double line) that divides the plane into an “inside” region 
and an “outside” region (Fig. 4.10, middle) 

Consider v2 which is colored green and v4 which is colored orange. These vertices 
cannot be contained in a single green-orange chain, because v2 is inside P and v4 is 
outside P, so any path connecting them must cross P, contradicting the assumption 
that the graph is planar. Therefore, they must be contained in two unconnected green- 
orange chains (double dashed line, in Fig. 4.10, middle). Exchange the colors on the 
chain containing v2 and then v can be colored green to obtain a five-coloring of G 
(Fig. 4.10, bottom). oO 


The statement that a continuous path from the inside of of a closed con- 
tinuous curve P to the outside of P must intersect P is the Jordan Curve 
Theorem. The theorem is intuitively obvious but difficult to prove. 


4.7 Kempe’s Incorrect Proof of the Four-Color Theorem 


Theorem 4.11 Any planar graph G can be four-colored. 


Proof (Incorrect) The base case of the induction and most of the proof is the same 
as that of the five-color theorem. The new case that must be considered is a vertex 
v with five neighbors which, by the inductive hypothesis, can be colored with four 
colors after removing v. 

In Fig. 4.1 1a there are two vertices v2, v5 colored blue. Consider the blue-green 
chain containing v2 and the blue-yellow chain containing vs. The blue-green chain 
is contained within the closed path defined by the red-yellow chain containing v1, v3 
(double line) and the blue-yellow chain in contained within the closed path defined 
by the red-green chain containing v;, v4 (double dashed line). 

Exchange the colors of both the blue-green chain and the blue-yellow chain 
(Fig. 4.11b). The result is that the neighbors of v are colored with the three colors 
red, green and yellow, leaving blue free to color v. o 


Heawood noted that the closed paths defined by the red-yellow chain and the 
red-green chain can share red vertices (vı, vg in Fig. 4.12a). When the colors are 
exchanged in the blue-green and blue-yellow chains, it is possible for blue vertices 
v6, v7 to be connected (Fig. 4.12b) and the coloring is no longer correct. 
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Fig. 4.11a Blue-green and blue-yellow Fig. 4.11b Exchange the colors of the two 
Kempe chains Kempe chains 


Fig. 4.12a Red-yellow and red-green chains Fig. 4.12b Exchanging colors causes the blue 
share red vertices vertices to become connected 


52 4 The Five-Color Theorem 


What Is the Surprise? 


The four-color theorem is notorious because it is so easy to state but extremely 
difficult to prove. Therefore, it is surprising that the proof of the five-color theorem 
is elementary. The clever part of the proof is Thm. 4.8 (a planar graph must have a 
vertex of at most degree 5), which is a theorem that has nothing to do with coloring. 
Instead, it results just from counting vertices and edges. 


Sources 


For the four-color theorem see [49, 54]. The proof of the five-color theorem is based 
on [1, 53]. [16] presents numerous proofs of Euler’s formula. Kempe’s incorrect 
proof of the four-color theorem is described in [46]. 
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How to Guard a Museum 


In 1973 Victor Klee asked how many guards are need to observe all the walls of a 
museum? If the walls form a regular polygon or even a convex polygon, one guard 
is sufficient (Fig. 5.1). 


Fig. 5.1 A museum whose walls form a convex polygon 


Consider now a museum with saw-toothed walls (Fig. 5.2). Verify by counting 
that the museum has 15 walls. Each “tooth” defines a triangle that is shaded gray in 
Fig. 5.3. A guard placed anywhere within one of the triangles can observe all the 
walls bounding that triangle (red arrows). 


Fig. 5.2 A museum whose walls do not form a convex polygon 
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Fig. 5.3 Visibility within each “tooth” 


If at least one of the guards is placed near the top wall spanning the entire museum, 
she can observe all the horizontal walls (blue arrows in Fig. 5.4). Thus 5 = 15/3 
guards are sufficient to observe all the walls of the museum. Since the triangles do 
not overlap a guard in one triangle will not be able to observe all the walls of another 
triangle (green arrow) so 5 guards are necessary. 


Fig. 5.4 Visibility of the walls of the museum 


The example in Fig. 5.2 can be generalized to n/3 teeth with n walls, so we 
conclude that at least n/3 guards are necessary. We wish to prove that n/3 guards 
are sufficient to guard any museum. 

Section 5.1 proves that any triangulated polygon can be three-colored. This is 
used in Sect. 5.2 to prove the theorem that n/3 guards are sufficient. Section 5.3 
completes the proof by showing that any polygon can be triangulated. 


5.1 Coloring Triangulated Polygons 


Definition 5.1 A diagonal a of polygon is an edge connecting two vertices that is 
not one of the (outside) edges of the polygon. 


Definition 5.2 A polygon can be triangulated if non-intersecting diagonals can be 
constructed such that the interior of the polygon is covered by triangles. 


Theorem 5.1 Any polygon can be triangulated. 


We defer the proof of Thm. 5.1. 
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Definition 5.3 A vertex of a polygon is convex if its interior angle is less than 180°; 
a vertex is concave if its interior angle is greater than 180°. 


In Fig. 5.5 vertex 1 is convex and vertex 2 is concave. 


Fig. 5.5 A polygon with a convex vertex (1) and a concave vertex (2) 


Definition 5.4 A polygon with vertices V can be three-colored if there is a map: 
c: V > {red, blue, green} , 
such that no edge has two vertices that are assigned the same color. 


Theorem 5.2 A triangulated polygon can be three-colored. 


Proof By induction on the number of vertices. A triangle can be three-colored. A 
triangulated polygon with n > 3 vertices must have a diagonal. Choose an arbitrary 
diagonal AB (Fig. 5.6a) and divide the polygon along this diagonal into two smaller 
polygons (Fig. 5.6b). By induction each of these smaller polygons can be three- 
colored (Fig. 5.7a). 

Since the colors assigned are arbitrary, if different colors are assigned to A, B in 
the two polygons, we can rename the colors in one of them so that the colors of A, B 
are the same in both polygons. For example, in Fig. 5.7b exchange red and green in 
the lower polygon. Paste the two polygons together to recover the original polygon 
with n vertices. It will be three-colored (Fig. 5.8). o 
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A B 
ay 
Fig. 5.6a An arbitrary diagonal in a polygon Fig. 5.6b Divide the polygon 
A B A a 
Fig. 5.7a Three-color the two smaller poly- Fig. 5.7b Exchange the colors of one polygon 


gons to match the other 


Fig. 5.8 Paste the two smaller polygons back together 


5.2 From Coloring of Polygons to Guarding a Museum 
Theorem 5.3 A museum with n walls can be guarded by n/3 guards. 


Proof By Thm. 5.1 the polygon can be triangulated and by Thm. 5.2 the polygon 
can be three-colored. All three vertices of each triangle in the triangulation must be 
colored by different colors in order to satisfy the condition of being three-colored. 
Since the polygon is three-colored, at least one color, say red, can appear at most n/3 
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Fig. 5.9 The exterior angles of a convex polygon 


times, and every triangle must have a vertex colored red. Station a guard at each red 
vertex; she can observe all the walls of the each triangle the vertex belongs to. Since 
the triangles of the triangulation include all the edges of the polygon, n/3 guards are 
sufficient to observe all the walls of the museum. o 


If n is not divisible by 3 the number of guards needed is |n/3], the largest integer 
less than or equal to n/3. For example, 4 guards are sufficient for museums with 
12, 13, 14 walls since | 12/3] = [13/3] = [14/3] = 4. For simplicity we ignore this 
complication. 


5.3 Any Polygon Can Be Triangulated 


Theorem 5.4 The sum of the interior angles of a polygon with n vertices is: 


180° (n - 2). 


Proof Consider a convex polygon and denote its exterior angles by 6; (Fig. 5.9). As 
you move from one dashed line in sequence to the next dashed line, you complete a 


rotation around a circle so: 
n 


Xo: = 360°. 


i=] 


For each exterior angle 9; denote its corresponding interior angle by ¢;. Then: 


y 6; = Vaso — ġi) = 360° 
i=1 i=] 


2, $i = n- 180° — 360° = 180° (n — 2) . 
i=l 
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Fig. 5.10 A concave vertex 


If there is a concave vertex (B in Fig. 5.10), there is a triangle formed by the two 
edges incident with the concave vertex and the line AC connecting the other two 
vertices. By summing the angles of the triangle we get: 


(180° — a) + (360° — 8) + (180° — y) = 180° 
a+B+y =3-180°. 


The sum of the interior angles increases by œ + 8 +y while the number of vertices 
increases by three preserving the equation in the theorem: 


Y gi + (a+ B+ 7) = 180°(n 2) +3 - 180° 


i=l 
= 180°((n +3) - 2). o 
Theorem 5.5 There must be at least three convex vertices in a polygon. 


Proof Let k be the number of concave vertices where the interior angle of each is 
180° + €;, c; > 0. The sum of the interior angles of the concave vertices is certainly 
less than or equal to the sum of the interior angles of all the vertices: 


k 
k - 180° + 3 ci < 180°(n—2) 
{=l 


A 


k 
(k +2) 180° + > é; < n- 180° 
{=l 
(k +2) - 180° < n- 180° 
k<n-2. 


It follows that there must at least three vertices that are convex, not concave. o 


Proof (Theorem 5.1) By induction on the number of vertices. For n = 3 there is 
nothing to prove. If n > 3, by Thm. 5.5 there must be a convex vertex C. Label its 
adjacent vertices by B, D. If BD is contained within the polygon (Fig. 5.1 1a), it is 
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E Cc E C 
D D 
I 
I 
A B A B 
Fig. 5.1la Triangulation where a diagonal is Fig. 5.11b Triangulation where a diagonal is 
contained within the polygon not contained within the polygon 


a diagonal and the polygon can be split into ABCD and another polygon ABDE 
with BD as an edge and which is smaller than the original polygon (Fig. 5.11a). By 
the inductive hypothesis, the polygon can be triangulated and then pasted back to 
ABCD, triangulating the original polygon. 

If BD is not contained in the polygon, there must be concave vertex F that is 
closest to C (Fig. 5.11b). CF is a diagonal and splits the polygon into two smaller 
polygons CFED and CFAB. By the inductive hypothesis these can be triangulated 
and pasted together. o 


What Is the Surprise? 


The museum theorem is suprising because what seems to be a theorem in geometry 
is proved rather elegantly by an appeal to coloring a graph. 


Sources 


This chapter is based on [1, Chap. 39]. 
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A 
Check for 


Chapter 6 isi 
Induction 


The axiom of mathematical induction is used extensively as a method of proof in 
mathematics. This chapter presents inductive proofs of results that may not be known 
to the reader. We begin with a short review of mathematical induction (Sect. 6.1). 
Section 6.2 proves results about the familiar Fibonacci numbers while Sect. 6.3 proves 
results about Fermat numbers. Section 6.4 presents the 91-function discovered by 
John McCarthy; the proof is by induction on an unusual sequence: integers in an 
inverse ordering. The proof of the formula for the Josephus problem (Sect. 6.5) is 
also unusual because of the double induction on two different parts of an expression. 


6.1 The Axiom of Mathematical Induction 


Mathematical induction is the primary method of proving statements to be true for 
an unbounded set of numbers. Consider: 


l=1, 142=3, 14+24+3=6, 14+24+3+4=10. 
We might notice that: 
1=(1-2)/2, 3=(2-3)/2, 6=(3-4)/2, 10=(4-5)/2, 


and then conjecture that for all integers n > 1: 


n 


yi B mnt 1) 


i=l 


If you have enough patience, checking this formula for any specific value of n is easy, 
but how can it be proved for all of the infinite number of positive integers? This is 
where mathematical induction comes in. 
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6 Induction 


Axiom 6.1 Let P(n) be a property (such as an equation, a formula, or a theorem), 


where n is a positive integer. Suppose that you can: 


¢ Base case: Prove that P(1) is true. 


e Inductive step: For arbitrary m, prove that P(m + 1) is true provided that you 


assume that P(m) is true. 


Then P(n) is true for all n > 1. The assumption that P(m) is true for arbitrary m is 


called the inductive hypothesis. 


Here is a simple example of a proof by mathematical induction. 


Theorem 6.2 Forn > 1: 


Proof The base case is trivial: 


ye ue) 


i=1 


The inductive hypothesis is that the following equation is true for m: 


re man D) 


i=1 


The inductive step is to prove the equation for m + 1: 


m+1 m 
vi = ` i+(m+1) 
i=l i=l 
m(m +1) (m+1)(m+2) 


Se aa +(m+1)= 


2 


By the principle of mathematical induction, for any n > 1: 


si = mnt 1) l 


i=1 


o 


The inductive hypothesis can be confusing because it seems that we are assuming 
what we are trying to prove. The reasoning is not circular because we assume the 
truth of a property for something small and then use the assumption to prove the 


property for something larger. 


Mathematical induction is an axiom so there can be no question of proving induc- 
tion. You just have to accept induction like you accept other axioms of mathematics 
such as x + 0 = x. Of course, you are free to reject mathematical induction, but then 


you will have to reject much of modern mathematics. 


6.2 Fibonacci Numbers 63 


Mathematical induction is a rule of inference that is one of the Peano axioms 
for formalizing natural numbers. The well-ordering axiom can be used to 
prove the axiom of induction and, conversely, the axiom of induction can 
be used to prove the well-ordering axiom. However, the axiom of induction 
cannot be proved from the other, more elementary, Peano axioms. 


6.2 Fibonacci Numbers 


Fibonacci numbers are the classic example of a recursive definition: 


fi=l 
h=1 
Sn = fa-1 + fa-2 for n23. 


The first twelve Fibonacci numbers are: 

1,1,2,3, 5,8, 13,21, 34,55, 89, 144. 
Theorem 6.3 Every fourth Fibonacci number is divisible by 3. 
Example 6.1 f4=3=3-1, fg =21=3-7, fig =144=3. 48. 


Proof Base case: f4 = 3 is divisible by 3. The inductive hypothesis is that f4n is 
divisible by 3. The inductive step is: 


Sane) = Sansa 
= fans3 + fans2 
= (fans2 + fansi) + fans2 
= ((fanti + fan) + fanei) + fanse 
= ((fansi + fan) + fansi) + (fanei + fan) 
= 3 fansi +2 fan - 


3 fant is divisible by 3 and, by the inductive hypothesis, f4n is divisible by 3. 
Therefore, f4(n+1) is divisible by 3. o 
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7 n 
Theorem 6.4 fn < (G) ; 


7\' IN 49 a tt ok l 
Proof Base cases: fi = 1 < Z and h =1 < ri ETA The inductive step is: 


Jn+1 = Sh + faa 


since: 


n_ an 1 M _ 
tn $ - ? where (o = n (o = ai 
Proof We first show that ¢? = ¢ + 1: 
1 5 á 
2 + 
21,285 5. 1,35 +1 
4 4 4HR 2 
=ġ+l1. 


Similarly, we can show that ¢* = ¢ + 1. 
The base case for Binet’s formula is: 


z l+y5 _ 1-v5 
TE eA 


6.2 Fibonacci Numbers 
Assume the inductive hypothesis for all k < n. The inductive step is: 
grt! — gm! = gg! _ gg! 
= (p+ 1g"! - (b+) ge"! 


= (6" - 6") +(¢" |-¢"") 
= V5 fr + V5 fai 


n+l _ 7nt+l 
aa = Sh + fn-1 = fnri . 


v5 


Theorem 6.6 


(elt lera) era 


n n E n! n! 
l i a ,) = ka-p! k+Din-(k+))! 


R ni(k +1) n\(n — k) 
= k+ Dn- k)! (einen)! 
7 n\(n+1) 
~ (k+1)'(n—k)! 
(n+ 1)! 


~ (kt DN(n+1) —(k+))! 
_ n+1 
= (ota) 


k k! 
We will also use the equality (3 


~ Ok — 0)! 


We can now prove the theorem. The base case is: 


1 1! 
n=) =a o z 


= ] for any k 2 1. 
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The inductive step is: 


Jn = fn-1 + fn-2 = o)e 7 


6.3 Fermat Numbers 


Definition 6.1 The integers F, = 27" + 1 for n > 0 are called Fermat numbers. 
The first five Fermat numbers are prime: 
Fo =3, Fr =5, Fo =17, F3=257, Fy = 65537. 


The seventeenth-century mathematician Pierre de Fermat claimed that all Fermat 
numbers are prime, but nearly a hundred years later Leonhard Euler showed that: 


Fs = 27 +1 = 23241 = 4294967297 = 641 x 6700417. 


Fermat numbers become extremely large as n increases. It is known that Fermat 
numbers are not prime for 5 < n < 32, but the factorization of some of those 
numbers is still not known. 


Theorem 6.7 For n > 2, the last digit of Fn is 7. 
Proof The base case is F = 2?’ +1 = 17. The inductive hypothesis is F, = 10k, +7 
for some kn > 1. The inductive step is: 
gn+l 27.21 Qn 2 
Fai so? e123 +1=(2 ) +1 


ES 2 
Ao +1)-1) +1=(F,-1)}+1 
= (10k +7 — 1) +1 = (10k, +6)? +1 
= 100k? + 120k, +36 + 1 
= 10(10k2 + 12k, +3)+6+1 


= l0kn+1 +7, forsome kny 2 1. oO 
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n-1 

Theorem 6.8 Forn > 1, Fy, = I] Fg +2. 
k=0 


Proof The base case is: 


0 
Fi =| | Fk +2= Fo +2=3+2=5. 
k=0 


The inductive step is: 


z a +1 -2) Z + 1) 


n\2 n+ 
= (2"") -1=(? '+1)-2 


= nel — 2 
n 
Foi =| | Fe +2. o 
k=0 


6.4 McCarthy’s 91-function 


We usually associate induction with proofs of properties defined on the set of positive 

integers. Here we bring an inductive proof based on a strange ordering where larger 

numbers are less than smaller numbers. The induction works because the only 

property required of the set is that it be ordered under some relational operator. 
Consider the following recursive function defined on the intergers: 


f(x)=if x > 100 then x- 10 else f(f(x+11)). 
For numbers greater than 100 the result of applying the function is trivial: 
f(101)=91, f(102)=92, f(103)=93, f(104) =94,.... 


What about numbers less than or equal to 100? Let us compute f(x) for some 
numbers, where the computation in each line uses the results of previous lines: 
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(100) = f(f(100 + 11)) = f(F(111)) = f(101) = 91 
(99) = f(f(99 +11) = f(F(110)) = F(100) = 91 
(98) = f(f(98 +11) = f(F(109)) = f(99) = 91 


F(91) = fF(FO1 +11) = F(F(102)) = f(92) 
= f (f (103)) = f (93) =-+- = f (98) =91 
f(90) = f(f(90+11)) = f(f(101)) = f(91) =91 


f(89) = fF(F(89+11)) = f(f(100)) = f(91) =91. 
Define the function g as: 
g(x) =if x > 100 then x- 10 else 91. 


Theorem 6.9 For all x, f (x) = g(x). 


Proof The proof is by induction over the set of integers S = {x |x < 101} using the 
relational operator < defined by: 


y <x ifandonlyif x< y, 


where on the right-hand side < is the usual relational operator on the integers. This 
definition results in the following ordering: 


101 < 100 < 99 < 98 < 97 <.---. 


There are three cases to the proof. We use the results of the above computations. 
Case 1: x > 100. This is trivial by the definitions of f and g. 
Case 2: 90 < x < 100. The base case of the induction is: 


f(100) = 91 = g(100), 


since we showed that f(100) = 91 and by definition g(100) = 91. 
The inductive assumption is f(y) = g(y) for y < x and the inductive step is: 


f(x) = f(f@+11)) (6.1a) 
= f(x+11-10) = f(x +1) (6.1b) 
= g(x + 1) (6.1c) 
=91 (6.1d) 
= g(x). (6.1e) 


Equation 6.la holds by definition of f since x < 100. The equality of Eq. 6.1a and 
Eq. 6.1b holds by the definition of f, because x > 90 sox +11 > 100. The equality 
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of Eq. 6.1b and Eq. 6.1c follows by the inductive hypothesis x < 100, sox+1 < 101 
which implies that x + 1 € S and x + 1 < x. The equality of Eq. 6.1c, Eq. 6.1d and 
Eq. 6.le follows by definition of g and x + 1 < 101, sox < 100. 

Case 3: x < 90. The base case is: f(89) = f(f(100)) = f(91) = 91 = g(89) by 
definition of g since 89 < 100. 

The inductive assumption is f(y) = g(y) for y < x and the inductive step is: 


f(x) = f(f(+ 1) (6.2a) 
= f(g(x +11)) (6.2b) 
= f(91) (6.2c) 
=91 (6.2d) 
= g(x). (6.2e) 


Equation 6.2a holds by definition of f and x < 90 < 100. The equality of Eq. 6.2a 
and Eq 6.2b follows from the inductive hypothesis x < 90, sox + 11 < 101, which 
implies that x +11 € Sandx+11 < x. The equality of Eq. 6.2b and Eq 6.2c follows 
by definition of g andx+11 < 101. Finally, we have already shown that f(91) = 91 
and g(x) = 91 for x < 90 by definition. o 


6.5 The Josephus Problem 


Josephus was the commander of the city of Yodfat during the Jewish rebellion 
against the Romans. The overwhelming strength of the Roman army eventually 
crushed the city’s resistance and Josephus took refuge in a cave with some of his 
men. They preferred to commit suicide rather than being killed or captured by the 
Romans. According the account by Josephus, he arranged to save himself, became 
an observer with the Romans and later wrote a history of the rebellion. We present 
the problem as an abstract mathematical one. 


Definition 6.2 (Josephus problem) Consider the numbers 1,...,n+1 arranged in 
a circle. Delete every g’th number going around the circle g,2q,3q,... (where 
the computation is performed modulo n+1) until only one number m remains. 
J(n+1,q) = m is the Josephus number for n + 1 and q. 


Example 6.2 Let n + 1 = 41 and let q = 3. Arrange the numbers in a circle: 


> 123 45 67 8 9101112131415 161718192021 | 
T 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 2423220 <— 
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The first round of deletions leads to: 


> 1234567 8 9101112131475 16178192071 | 
T 41 40 399 38 37 36 35 34 33 32 31 30 29 28 97 26 25 %4 2322 e 


After removing the deleted numbers this can be written as: 

124578 10 11 13 14 16 17 19 20 22 23 25 26 28 29 31 32 34 35 37 38 40 41 

The second round of deletions (starting at the last deletion of 39) leads to: 
Y24978 y0 11 13 y4 16 17 y9 20 22 73 25 26 98 29 31 y2 34 35 y7 38 40 41 

We continue deleting every third number until only one remains: 


2 4 7 811 ¥3 16 17 20 22 25 26 29 31 34 35 38 40 
2 4 £11 16 V7 22 25 29 31 35 38 

2 4 Yi 16 22 35 31 35 

2 416 22 31 35 

A 163135 

y6 31 

31 


It follows that J(41,3) = 31. 


The reader is invited to perform the computation for deleting every seventh 
number from a circle of 40 numbers in order to verify that the last number is 30. 


Theorem 6.10 J(n + 1,q) = (J(n,q) +q) (mod n +1). 


Proof The first number deleted in the first round is the g’th number and the numbers 
that remain after the deletion are the n numbers: 


12... q-l1 q+l1...nn+1 (modn+l1). 


Counting to find the next deletion starts with q + 1. Mapping 1,...,” into this 
sequence we get: 


1 2 ...n-q n+l-q n+2-q...n-1 n (mod n+!) 
l l ) l l l l 
q+1 q+2... n n+1 1 ...g-2 q-1 (mod n+l). 


Remember that the computations are modulo n + 1: 


(n+2-q)+q=(n+1)+1 = 1 (mod n+ 1) 
(n)+q =(n+1)-1+q=q-1 (modn+1). 


6.5 The Josephus Problem 71 


This is the Josephus problem for n numbers, except that the numbers are offset 
by q. It follows that: 


J(n+1,q) =(J(n,q)+q) (modn+1). go 


Theorem 6.11 Forn > 1 there existnumbers a > 0,0 < t < 2%, such that n = 2° +t. 


Proof This can be proved from repeated application of the division algorithm with 
divisors 2°,2!,27, 24, ..., but it is easy to see from the binary representation of n. 
For some a and ba-1, bg_2,...,b1, bo, where for all i, bi = 0 or b; = 1, n can be 
expressed as: 


n =2" + ba-12%! ee ED’ 
n = 2% + (ba-12%! +--+ + b02?) 


n=2%+t, wheret < 27-1. o 


We now prove that there is simple closed form for J (n, 2). 
Theorem 6.12 For n =2° +t, a > 0,0 < t < 2%, J(n,2) =2t +1. 


Proof By Thm. 6.11, n can be expressed as stated in the theorem. The proof that 
J(n, 2) = 2t + 1 is by a double induction, first on a and then on t. 

First induction: 

Base case. Assume that t = 0 so that n = 2%. Let a = 1 so that there are two 
numbers in the circle 1,2. Since q = 2, the second number will be deleted, so the 
remaining number is 1 and J(2!,2) = 1. 

The inductive hypothesis is that J(2“,2) = 1. What is J(2“*!,2)? In the first 
round all the even numbers are deleted: 


1 2 3 4 ... 21-1 2%, 
There are now 2% numbers left: 
VS. ee (2SFLET., 


By the inductive hypothesis J(2¢*!,2) = J(2%,2) = 1 so by induction J(n,2) = 1 
whenever n = 2° +0. 
Second induction: 
We have proved J(2% + 0,2) = 2 - 0 + 1, the base case of the second induction. 
The inductive hypothesis is /(2% + t,2) = 2t + 1. By Thm. 6.10: 


J(2% + (t+1),2) = J(2% +t,2)+2=2t+1+2=2(t+1)+1. o 
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Theorems 6.11 and 6.12 give a simple algorithm for computing J (n, 2). From the 
proof of Thm. 6.11: 


n=2% +t =2" + (ba-12%! +--+ bo2®), 


so t = ba-12%7! +- - -+bo2?. We simply multiply by 2 (shift left by one digit) and add 
1. For example, since n = 41 = 2°+23+2° = 101001, it follows that J (41,2) = 2t+1, 
and in binary notation: 


41 = 101001 
9 = 01001 
2t +1 = 10011 = 16+2+1=19. 


The reader can verify the result by deleting every second number in a circle 1,...,41. 
There is a closed form for J (n, 3) but it is quite complicated. 


What Is the Surprise? 


Induction is perhaps the most important proof technique in modern mathematics. 
While Fibonacci numbers are extremely well-known and Fermat numbers are also 
easy to understand, I was surprised to find so many formulas that I never knew (such 
as Thms. 6.3 and 6.4) that can be proved by induction. McCarthy’s 91-function was 
discovered in the context of computer science although it is a purely mathematical 
result. What is surprising is not the function itself, but the strange induction used to 
prove it where 98 < 97. The surprise of the Josephus problem is the bidirectional 
inductive proof. 


Sources 


For a comprehensive presentation of induction see [21]. The proof of McCarthy’s 91- 
function is from [30] where it is attributed to Rod M. Burstall. The presentation of the 
Josephus problem is based on [21, Chapter 17], which also discusses the historical 
background. That chapter contains other interesting problems with inductive proofs, 
such as the muddy children, the counterfeit coin and the pennies in a box. Additional 
material on the Josephus problem can be found in [44, 57]. 
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Solving Quadratic Equations 


Poh-Shen Loh proposed a method for solving quadratic equations that is based on a 
relation between the coefficients of the quadratic polynomial and its roots. Section 7.1 
reviews the traditional methods for solving quadratic equations. Section 7.2 tries to 
convince the reader that Loh’s method makes sense and then explains how to compute 
the roots. In Sect. 7.3 the computation is carried out for two quadratic polynomials 
and a similar computation for a quartic polynomial. Section 7.4 derives the traditional 
formula for the roots from Loh’s formulas. 

The introduction of algebra and modern algebraic notation is relatively recent. 
Previously, mathematicians used geometry almost exclusively, so it is interesting 
to look at al-Khwarizmi’s geometric construction of the formula for the roots of 
quadratic equations (Sect. 7.5). Section 7.6 shows a clever geometric construction 
used by Cardano in developing the formula for the roots of cubic equations. 

Section 7.8 presents other geometric methods for finding the roots of quadratic 
equations.! The chapter concludes with Sect. 7.9 which discusses numerical com- 
putation of the roots of quadratic equations. 


7.1 Traditional Methods for Solving Quadratic Equations 


Every student of mathematics memorizes the formula for obtaining the roots of a 
quadratic equation ax? + bx + c = 0: 


—b + Vb? —4ac 


X1, X2 = a 


1 Chapter 11 is a prerequisite for a full understanding of these methods. 
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For now we will work with monic polynomials, x? + bx + c = 0, whose roots are: 


-b + Vb? -4c 
sni = —— s. (1.1) 


Another method of solving quadratic equations is by factoring the polynomials 
more-or-less by trial-and-error. Sometimes it is easy to obtain the roots by factoring: 


x? -4x +3 = (x-1)(x- 3). (1.2) 


It is much harder to factor x? — 2x — 24 because there are many possible pairs of 
roots that must be considered: 


(+1, 24) , (+2, 12) , (+3, =8) , (+4, F6). 


7.2 The Relation Between the Roots and the Coefficients 


Theorem 7.1 If r1, 72 are the roots of x? + bx +c then: 
(x -ri)(x- r2) =x? - (rı +r2)x +rir2 =x +bx+c. 
Therefore, even if we do not know the values of the roots, we do know that: 
ry tr2 =-—b, rır =c. (7.3) 


There is really nothing to prove because the result emerges from the computation. 
Consider some values of —b, r1, r2 and let 12 be the average of 71, r2: 


For any quadratic equation the average of the two roots is constant: 


ceti = (-b-r2)+r2 = b 


™\,2 


2 2 2 


7.2 The Relation Between the Roots and the Coefficients 75 


m2 — rj mi —12 
>< 


fitri 


‘e 
m2 
r2 


Fig. 7.1 Relation between the roots r1, r2 = 2, 6 and their average mj 2 = 4 


Let s be any number. Then: 


-b= -04s +09 =(Frs}4(P-s}=n4n. 


If one root is at distance s from the average, the other root is at distance —s from the 
average. For r1, r2 = 2,6, where m2 = 4, s = 2, we have: 


=b rı) m| mp|mp-ri|m-r2 
33} 12) 21| 165) 44| -4% 
33} 8| 25| 165) 84] -8% 
33 1| 32| 165} 154| -154 
-4| -16| 12| -2 14| -14 
-4| -4| o -2 2|) -2 
-4| -3| -1| -2 1} -1 


Figure 7.1 visualizes this relation. If we use other values r1,r2 = 3,5 for which 
rı +12 = 8 then mız = 4 remains the same while s becomes | (Fig. 7.2). 
The offset s seems to be arbitrary in: 


E aa _(-b 
ry = 2 S|, r2 = 7 S|, 


but there is an additional constraint rjr2 = c, where c is the constant term in the 
polynomial. By multiplying the two expressions we have derived for r1, r2, we can 
determine s and then rj, r2: 


a eae b bB o; 
c= 2 S 7 s=% S 
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Fig. 7.2 Relation between the roots r1, r2 = 3, 5 and their average mj 2 = 4 


7.3 Examples of Loh’s Method 


Example 7.1 Consider the polynomial x? — 2x — 24 where b = —2, c = —24: 


f 2) (-2) 
e=- +s] (-S-5] 


-24 = (1+ s)(1 - s) 


s=5 
ri=1+5=6 
rm =1-5=-4. 


Check: (x — 6)(x — (—4)) = x? — 2x — 24. 
Example 7.2 Let us find the roots of x? — 83x — 2310: 


83 83 
—2310 = (= + s] (F = | 


2 9889, 5349 = 16129 


4 4 
127 
ae 
127 
ry = a8 — — =-22 
2 2 
83 127 
= — — = 1 
r2 7 + 7 05 


Check: (x +22) (x — 105) = x? — 83x — 2310. 
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Compare this computation with the computation using the traditional formula: 


-b + Vb? — 4c _ =(-83) + y¥(-83)? - 4 - (2310) 


2 2 
_ 83=vVI6I29 83+127 
E 2 — 2 
Bob 83-127 a y 
2 
83 +127 
BE =A = 105. 


Example 7.3 Theorem 7.1 can be generalized to polynomials of higher degrees. 
Here is an interesting example for a quartic equation x* — 10x? — x + 20 = 0. As 
with quadratic equations there are formulas for solving cubic and quartic equations 
(though not equations of higher powers), but the formulas are quite complicated. 

Does this polynomial of degree four factor into two quadratic polynomials with 
integer coefficients? If so, the coefficients of the x terms must be equal and of 
opposite signs since the coefficient of the x* term is zero. Therefore, the form of the 
quadratic factors is: 


f (x) = (x? -nx + ky) (x? + nx + ko). 
Carrying out the multiplication results in: 


f(x) = xt 4nx3 +kox? 
—nx3 —n?x? —nkox 


+k x2 +nk x +kyk2 z 
Equating the coefficients gives three equations in the three unknowns n, k1, k2 gives: 


(ki +k) -= n? = -10 
n(kı = k2) =-1 
kiko =20. 


Since we are looking for factors with integer coefficients, from the last two equations 
it is clear that: 


n=1,k, =4,k.=5 or n= l, kı = —5, k = —4. 


Only n = 1, kı = —5, ko = —4 satisfy the first equation for the coefficient of x: 


f(x) = (x? - x-5) (x7 +x- 4). 
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Solving these quadratic equations gives four solutions of the quartic equation: 


MESAI -1 + V17 


7.4 Derivation of the Traditional Formula 


For an arbitrary monic polynomial x? + bx + c, Loh’s formulas are: 


earn 


the traditional formula for obtaining the roots of a monic quadratic polynomial. If 
the polynomial is not monic divide it by a, substitute in the equation and simplify: 


b c 
xX +—x4+—=0 


a a 
_ —(b/a) + y (b/a)? -4(c/a) 


r1, r2 


2 
—b + Vb? — 4ac 


2a 


7.5 Al-Khwarizmi’s Geometric Solution of Quadratic Equations 


Let us write a monic quadratic polynomial as x? + bx — c. The roots can be found by 
completing the square: 


eall 


b 2 

2 
b 
any) 


2 
2 


(2 

etiz 

+ b” 

etiz 
b b\? —b+Vb24+4c 

E iaa 4 G+ = 5 
2 2 2 
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This is the familiar formula for finding the roots of a quadratic equation, except that 
4c has the opposite sign since the coefficient of the constant term was —c. 

Completing the square was developed in the 8th century by Muhammad ibn Musa 
al-Khwarizmi in a geometric context. Given the equation x? + bx = c, assume that 
there is a square whose side is x so that its area is x”. To the area x? add bx by adding 
four rectangles of area bx/4 whose sides are b/4 and x (Fig. 7.3a). Now complete 
the diagram to a square by adding the four little squares of area (b/4)? (Fig. 7.3b). 

We can’t construct the diagram in Fig. 7.3a because we don’t know what x is, but 
the area of the larger square in Fig. 7.3b is: 

RA Se. 
4 4 

which we do know since we are given the coefficients b,c. By constructing the 
diagram and erasing the small squares whose sides are (b/4)—another known 
quantity—we obtain the line segment of length x. 


Example 7.4 Letx*+12x = 64. Then c+(b?/4) = 64+36 = 100. Itis easy to construct 
a square of area 100 since each side has length 10. Now subtract (b/4) + (b/4) = 6, 
the sides of the smaller squares, to get x = 10 — 6 = 4. 


7.6 Cardano’s Construction for Solving Cubic Equations 


The formula for the roots of cubic equations was first published in the 16th century 
by Gerolamo Cardano. We will not develop the formula here, but it is interesting 
that the central idea is based on a geometric construction similar to al-Khwarizmi’s. 


b b 1 b b l 
4 x 4 1 4 x 4 1 
b b b b 
4 4 4 4 
x x x x 
b b b b 
4 4 4 4 
I 
b i b ' b x% b 1 
4 4 1 4 4 1 
| eae ee | 
Fig. 7.3a The areais x7+4(b/4)x = x?+bx Fig. 7.3b The area is x? + 4(b/4)x + 


4(b/4)? = x? + bx + (b*/4) 
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The construction can be obtained very simply using algebra. By multiplication: 
(a+b) =a? +3a7b + 3ab* + b? = (a? + b°?) +3ab(atb). (7.4) 


Geometrically, we start with a cube whose side is a + b so that its volume is (a + b)’. 
The cube is decomposed into five pieces. The first two are cubes whose sides are a 
and b with volumes a? (blue) and b? (red), respectively (Fig. 7.4). 

The other three parts are boxes (the technical term is cuboid) each with one side 
of length a + b coinciding with a side of the cube, one side of length a and one side 
of length b, so that the volume of each of the three boxes is ab(a + b). In Fig. 7.5, 
there is one box at the left side of the cube (blue), one at the back of the cube (red) 
and one at the top of the cube (green). By combining the five solids in Fig. 7.4 and 
Fig. 7.5 we obtain Eq. 7.4. 


b,at+b,at+b atb,atb,atb 
0,atb,atb 
O,atb,a fb abb.a 
Gales atb,d,atb 
i b,a,a+b 
0,a,a atb,a,a 
b,a,a 
0,at+bs0 
atb,atb, 
b,at+b,0 
b,a,0 ‘atb,a,0 
at+b,0lat+b 
0,0,d+b 
0,0,a ‘atb,0,a 
b,0,a 
0,0,0 b,0,0 at+b,0,0 


Fig. 7.4 (a? +b?) = (a? +b?) phus 
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b,atb,atb a+b,a+b,a+b 
0,a+b,a+b 
0,a+b,a b,atb,a 
dalar a+b,d,a+b 
b,a,a+b 
0,a,a a+p,a,a 
b,a,a 
O,at 
atb,at 
b,at+b,0 
b,a,0 ‘atb,a,0 
a+b,0ja+b 
0,0,d¢b 
0,0,a ‘atb,0,a 
b,0,a 
0,0,0 b,0,0 at+b,0,0 
Fig. 7.5 (a? +b?) =---+3ab(a+b) 


7.7 They Weren’t Intimidated by Imaginary Numbers 


The history of mathematics demonstrates a progression of concepts that were initially 
considered to be meaningless, but were eventually understood, accepted and proved 
to be useful. “Obviously,” since numbers count things, —1, a negative number, is 
meaningless. “Obviously,” since numbers are ratios of integers (rational numbers), 
V2, which can easily be proved to be irrational, is meaningless. “Obviously,” V—1, the 
square root of a negative number, is meaningless since there is no number—integer, 
rational or real—whose square is —1. 

A full understanding of the square roots of negative numbers, to this day called 
imaginary numbers although they are no less real than real numbers, was not achieved 
until the nineteenth century. Therefore, it is surprising that already in the sixteenth 
century, Geralamo Cardano and Rafael Bombelli refused to be intimidated by the 
concept, and took the first small steps towards understanding these numbers. 
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Consider the quadratic equation: 
x? -10x+40=0. (7.5) 
By the familiar formula (Eq. 7.1): 


10+ V100- 1 
pose Ve eens 


1,72 = 2 


Well, we don’t know anything about the square roots of negative numbers and we 
don’t know what these values are, but like Cardano we do know by Thm 7.1 that: 


rı trp = (5 + V=15) + (5 - V-15) = 10 = -b 
rir = (5+ V-15)(5 — V-15) = 25 — 5V-15 + 5V-15 - (-15) = 40 =c. 


which correspond with the coefficients of the quadratic equation Eq. 7.5. It is rather 
intuitive that V-15 + (V-15) = 0 even if we know nothing about V-15, and, 
similarly, it is rather intuitive that V-15 - -(V-15) = —(—15) = 15 even if we don’t 
know what V-15 is. 

Consider now the cubic equation: 


x- 15x-4=0. (7.6) 


It is not hard to observe that 4 is a root, but how can it be computed? Cardano’s 


formula gives the root: 
= 2+ 11V-1+42-11V-1, (7.7) 


a quite complicated formula that bears no obviously relation to 4. 
Bombelli courageously performed the following computation (see Eq. 7.4): 


(2+ V-1)? = 8+3- 4V-1 +3- 2(-1) +(-1V-1) =24+11V-1 
(2 -= V-1)? = 8-3 -4V-1+3-2(-1) - (-1V-1) =2-11V-1, 


and by Eq. 7.7: 


E 


= Je +V + 2- vay? 
= (2+ V-1) + (2 - V-1) =4 
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Fig. 7.6 Lill’s method on x? — 4x + 3 


7.8 Lill’s Method and Carlyle’s Circle 


Lill’s method can be applied to solve quadratic equations.2 As an example we use 
Eq. 7.2 which gives the roots of a quadratic equation obtained by factorization: 


x +bx+ce =x —4x43 =(x-1)(x-3). 


Applying Lill’s method results in the paths shown in Fig. 7.6. 
Check that the angles are correct: 


—tan(-—45°) = —1, -—tan(-—71.57°) x -3. 


For quadratic equations we can find the points Pı, P2 as the intersections of the line 
representing the coefficient b and the circle whose diameter is the line connecting 
the starting point and the end point of the paths (Fig. 7.7). In order for a point on the 
line b to be a root, the reflection of the line must be 90° and therefore the inscribed 
angle is subtended by a diameter. 

This can also be checked by computation. The center of the circle is the midpoint 
of the diameter (—1, —2). The length of the diameter is: 


(—2)? + (-4)? = V20, 


2 This section assumes that you have read about Lill’s method in Chap. 11. 
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Fig. 7.7 Constructing a circle to find the roots 


2 
so the square of the length of the radius is (v20 2) = 5. We need the intersection 
of this circle and the line x = 1: 


(@- (“DY a er? 
(x? + 2x +1) + (y? +4y 44) =5 
y? +4y+3=0 

y=-l, -3. 


A similar method for solving quadratic equations is the Carlyle circle which 
predates Lill’s method. Given a quadratic equation x? — bx + c (note the minus sign 
on the linear term), construct points at (0,1) and (b,c). Construct a circle whose 
diameter is the line connecting the two points (Fig. 7.8). Its intersections (if any) 
with the x-axis are the roots of the equation. 

In the general case, the center of the circle is (b/2, (c — (—1))/2) and the length 
of the diameter is yb? + (c — 1)?, so the equation of the circle is: 


b-i) pE) = Pee 


2 4 


For the example, substituting b = 4, c = 3 and y = 0, we see that x = 1 and x = 3 are 
the roots of the quadratic equation. 
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Fig. 7.8 Carlyle circle for x? — 4x +3 


7.9 Numerical Computation of the Roots 


Students learn symbolic computation of roots, derivatives and so on. Today, most 
computation is performed by computers so symbolic computation is less important. 
Numerical analysis is the branch of mathematics and computer science that develops 
accurate and efficient computational methods. The main challenge is to deal with the 
finiteness of values stored in the computer’s memory. The computation: 


0.12 x 0.14 = 0.0168 


is easy to do, but: 
0.123456789 x 0.123456789 


needs eighteen digits to be accurately represented and this cannot be done in a 
memory word that stores sixteen digits. This error is called a round-off error. 
An even more serious problem is encountered when floating point arithmetic is 
performed. Clearly: 
(0.12 x 10710) x (0.14 x 1078) 


would not be computed by writing out all the zero digits. Instead, we multiply the 
mantissas and add the exponents to obtain 0.0168 x 10718, which is normalized 
to 0.168 x 107! so that the most significant digit appears after the decimal point, 
ensuring maximum precision given the fixed size of the mantissa. If the maximum 
exponent that can be represented is — 16 the result simply cannot even be stored. This 
error is called floating-point underflow. 
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The formula for finding the roots of the quadratic equation x? + bx + c is: 


-b + Vb? — 4c 
5 


1,72 = 


(7.8) 


Consider what happens if b = 1000 and c = 4. The roots are: 


—1000 + ¥1000000 — 16 
5 f 


1,72 = 


Depending on the precision of the arithmetic, it is possible that one of the roots is 
so close to zero that the value stored is zero. Evaluating the quadratic equation gives 
the surprising result 0? +b -0+4=4=0. 

Can we do better? By Eq. 7.3: 


ritr =b, rın =c. 


If r2 is much less that r1, written r2 « r1, then rı ~ —b and r2 = c/b. Table 7.1, 
computed by a computer program, compares the values of the roots computed by 
these formulas with the values obtained from the traditional formula Eq. 7.8. The 
value of c is fixed at 4 and the roots for increasing values of b are shown. 

Initially, the true values computed by the traditional formula for r2 are more 
accurate (r2 — r2y is negative) but from b = 100000, the computation based upon 
Eq. 7.3 is more accurate. Such are the surprises of numerical analysis. 


Table 7.1 Two computations of the roots of a quadratic equation. rı, r2 are the roots computed 
by Eq. 7.8. riv, 72, are the roots computed using Eq. 7.3. The errors are r; — riy. The values are 
truncated to four decimal places. Floating-point numbers are written —4e — 5 in place of 4 x 1075 
because computer programs are normally written as linear sequences of characters. 


b ri Fiv Error, r2 ry Error 

100 —99.9599 —100 0.0400 —0.04001 —0.04 -1.6012e-05 
1000 —999.9959 —1000 0.0040 —0.0040 0.004 —1.6000e -08 
10000 —9999.9996 — 10000 0.0004 0.0004 -0.0004 -1.6270e-11 


100000 -99999.9999 —100000 3.9999e-5 -3.9999e-5 4e-5 1.0104e-12 
1000000 -—-999999.9999 —1000000 4.0000e-6 -3.9999e-6 4e-6 2.7749e-11 
10000000 -—10000000.0 -10000000 3.9860e-—7 -3.9953e-7 4e-7 4.626le-10 
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What Is the Surprise? 


Poh-Shen Loh’s approach provides a new way of looking at the relation between the 
coefficients and the roots that one doesn’t see simply by memorizing the traditional 
formula. What is surprising is that this relation is fundamental in Gauss’s algebraic 
proof of the constructibility of a regular heptadecagon (Chap. 16). 

With the modern dominance of algebraic methods in geometry it is important 
to be reminded that the reverse once held. As shown by the constructions of Al- 
Khwarizmi and Cardano, geometric methods were used to obtain results in algebra. 
Lill and Carlyle both developed geometric methods for solving quadratic equations. 
Considerations of numerical computation on computers will surprise students who 
have not experienced it before. 


Sources 


Poh-Shen Loh’s method is from [28, 29]. Al-Khwarizmi’s construction is from 
[6, Chapter 1] and [32]. Cardano’s construction can be found in [6, Chap. 1]. For 
the colorful history of the development of Cardano’s formula see [52]. The early 
attempts at computing with imaginary numbers are from [6, Chapter 2]. Lill’s method 
and Carlyle’s circle can be found in [61] together with a discussion of numerical 
computation of the roots. 
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A 
Check for 


Chapter 8 ta 
Ramsey Theory 


Ramsey theory is a branch of combinatorics that asks questions of the form: How 
large must a set be so that if it is divided into subsets, at least one subset has a 
certain property? Results in Ramsey theory are difficult to prove and there remain 
many open problems. In this chapter we present simple cases of four problems to 
give a taste of this fascinating subject: Schur triples (Sect. 8.1)—triples of integers 
such that a + b = c, Pythagorean triples (Sect. 8.2)—triples of integers such that 
a? + b? = c?, van der Waarden’s problem (Sect. 8.3) which concerns sequences of 
numbers, and Ramsey’s theorem (Sect. 8.4) on coloring graphs. Section 8.5 shows 
how the probabilistic method in combinatorics can be used to develop a lower bound 
for Ramsey numbers. 

The Pythagorean triples problem was recently solved with the aid of computers, 
using a relatively new method call SAT solving. For readers familiar with proposi- 
tional logic Sect. 8.6 gives an overview of how this is done. 

Section 8.7 describes Pythagorean triples as known to the Babylonians four 
thousand years ago. 

Terminology: monochromatic means of the same color. 


8.1 Schur triples 


Definition 8.1 Given any decomposition of the set of positive integers: 
S(n) = {1,... n} 


into two disjoint subsets S1, S2, do there exist {a,b,c} © S1 or {a,b,c} C S2 (or 
both) such that a < b <c and a +b = c? If so, the set {a, b, c} is called a Schur triple. 


Example 8.1 For n = 8, in the decomposition: 


S1 = {1,2,3,4}, So = {5,6,7,8}, (8.1) 
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the set Sı includes the Schur triple {1, 2, 3}. However, the decomposition: 


Si = {1, 2,4, 8}, 85 = {3,5,6,7}, (8.2) 


does not contain a Schur triple, as you can check by enumerating all the triples in 
each subset. 


Theorem 8.1 Jn all decompositions of S(9) = {1,...,9} into two disjoint subsets, 
at least one subset contains a Schur triple. 


Of course we could check the 2? = 512 decompositions of S(9) into two disjoint 
subsets, but let us try come up with a more succinct proof. 


Proof We try to construct a decomposition that does not contain a Schur triple and 
show that the constraints of the problem make this impossible. Start by placing 1 
and 3 into the subset S1. 2 must be placed in S2 because 1 + 2 = 3 and we are trying 
to construct a decomposition that does not contain a Schur triple. Similarly, 4 must 
be placed in S2 because 1 + 3 = 4. Continuing, 6 is placed in Sı because 2+ 4 = 6 
and 7 is placed in Sz because 1 + 6 = 7. However, 3+ 6 = 9 and 2 +7 = 9, so 9 must 
appear in both Sı and S2, a contradiction. The sequence of inferences is shown in 
the following table: 


S1 So 

1,3 

1,3 2 

1,3 2,4 
1,3,6 2,4 
1,3,6 2,4,7 
1,3,6,9 2,4,7 
1,3,6,9 2,4,7,9 


Backtracking, we search for a decomposition where 1,3 are in different subsets. If 
we place 5 into S2, a sequence of inferences again leads to a contradiction because 9 
must appear in both subsets. The reader should justify each of the inferences shown 


in the following table: 


Sy So 

1 3 

1 3,5 

1,2 3,5 

1,2,8 3,5 

1,2,8 3308 2 
1,2,8 3,5,7,9 
1,2,8 3,5,6,7,9 
1,2,8,9 3,5,6,7,9 
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Backtracking again, we try to place 5 into S4, but that also leads to a contradiction, 
as shown in the following table: 


S1 So 

1 3 

1,5 3 

1,5 3,4 
1,5 3,4, 6 
1,2,5 3,4, 6 
1,2,5 3,4,6,7 


1,2,5,7 3,4, 6,7 
If follows that there is no decomposition that does not include a Schur triple. o 


Issai Schur proved the following theorem: 


Theorem 8.2 (Schur) For every k > 2 there is a smallest n such that in any disjoint 
decomposition of S(n) into k subsets, at least one of the subsets must contain a Schur 
triple. 


8.2 Pythagorean Triples 


Definition 8.2 Given any decomposition of the set of positive integers: 
S(n) = {1,...,n} 


into two disjoint subsets S1, S2, do there exist {a,b,c} © S1 or {a,b,c} © S2 (or 
both) such that a < b < c and a? + b? = c*? If so, {a, b, c} is called a Pythagorean 
triple. 


Example 8.2 For n = 10, in the decomposition into even and odd numbers: 
Sı = {1,3,5,7,9}, S2 = {2,4,6, 8, 10}, 


there are no Pythagorean triples in Sı but {6, 8, 10} in Sz is a Pythagorean triple 
since 6° + 8° = 10°. 


Marijn J.H. Heule and Oliver Kullmann proved the following theorems. Their 
method of proof is discussed in Sect. 8.6. 


Theorem 8.3 Foralln < 7824, there is some decomposition of S(n) into two disjoint 
subsets such that both subsets do not contain a Pythagorean triple. 
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Theorem 8.4 For all n > 7825, in all decompositions of S(n) into two disjoint 
subsets at least one subset contains a Pythagorean triple. 


It is impossible to check all 278*° decompositions of 5(7825). If we could check 
one decomposition every microsecond, 2785 microseconds ~ 106° years, while the 
estimated age of the universe is only about 10!° years. 


8.3 Van der Waerden’s problem 


Consider the sequences of eight colored dots in Fig. 8.1. In the top sequence there 
are red dots at positions (1, 2,3) and blue dots at positions (4,5, 6). In each case, 
the positions form an arithmetic progression. Similarly, in the middle sequence 
the red dots at positions (1,3,5) form an arithmetic progression. However, in the 
bottom sequence there is no set of three monochromatic dots whose positions form 
an arithmetic progression. Triples of red dots are at positions (1,2,5), (1, 2,6), 
(2,5, 6), none of which form arithmetic progressions, and similarly for the blue dots. 


Fig. 8.1 van der Waerden’s problem for eight colored dots 


With nine dots any coloring must contain a sequence of three monochromatic dots 
that form an arithmetic progression. For example, let us add a red dot or a blue dot at 
the end of the bottom sequence in Fig. 8.1 to obtain the sequences in Fig. 8.2. In the 
top sequence there are red dots at positions (1,5, 9), an arithmetic progression, and 
in the bottom sequence there are blue dots at positions (7, 8,9), also an arithmetic 
progression. 

Bartel L. van der Waerden posed the following problem: For any positive integer 
k, what is the smallest number n such that any sequence of n colored dots must 
contain a sequence of k monochromatic dots that form an arithmetic progression? 
For k = 3, n = 9, as demonstrated above for one decomposition. The next result is 
more difficult to show: for k = 4, n = 35. 
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1 2 3 4 5 6 7 8 9 
oeoeoeoeeeoe#eoe ®@ 


Fig. 8.2 van der Waerden’s problem for nine colored dots 


8.4 Ramsey’s Theorem 


Color the edges of Ks, the complete graph on 5 vertices, with two colors as shown in 
Fig. 8.3a. There are no monochromatic subgraphs K3 (triangles) in the graph. Fig- 
ure 8.3b shows one coloring of K6 and it is easy to see that there are monochromatic 
triangles AACE and ABDF. In this section we prove a simple case of a theorem by 
Frank P. Ramsey on the existence of subsets with a certain property. 


Fig. 8.3a A coloring of Ks with two colors Fig. 8.3b A coloring K6 with two colors 


Definition 8.3. R(k), the Ramsey number for k, is the smallest number n such that 
in any coloring of Kn, the complete graph on n vertices, with two colors there is a 
monochromatic complete subgraph Kx. 


Theorem 8.5 (Ramsey) R(3) = 6. 


Proof Figure 8.3a shows that R(3) > 5. To show that R(3) < 6, consider any vertex 
v in K6. v is connected to five other vertices, and when the edges are colored with 
two colors there must be at least three monochromatic edges incident with v. 


94 8 Ramsey Theory 


Fig. 8.4a One vertex of K6 Fig. 8.4b Monochromatic triangles in K6 


In Fig. 8.4a, AB, AC, AE are colored red. Since the graph is complete all the 
vertices are connected, so if any one of the edges BC, BE, CE is colored red, say 
BE, a red triangle AABE is formed. Otherwise, all three edges of these edges are 


colored blue and they form a blue triangle (Fig. 8.4b). oO 


The theorem can be generalized to any number of colors, as well as to colorings 
where the sizes of the subgraphs are not the same. R(r, b, g) is the smallest complete 
graph such that in any coloring with three colors there must be complete subgraphs 
with r red edges, b blue edges and g green edges. 


8.5 The Probabilistic Method 


The only known non-trivial Ramsey numbers are R(3) = 6 and R(4) = 18. In 1947 
Paul Erdős developed the probabilistic method and used it to show lower and upper 
bounds on R(k). Subsequent research has improved both bounds, but this is still 
a significant research area since the bounds are not tight. For example, it has been 
proved that 43 < R(5) < 48 and 798 < R(10) < 23556. In this section elementary 
probability is used to obtain a lower bound on R(k). 

To show that there exists an element of a set S that has property A, prove that the 
probability of a random element of S having property A is greater than zero. It is 
important to understand that the method is non-constructive: it just proves that such 
an element exists but does not construct one. Although from Thm. 8.5 we know that 
R(3) = 6, let us use the probabilistic method to obtain a lower bound for R(3). 


8.5 The Probabilistic Method 95 
Theorem 8.6 (Erdős) R(3) > 4. 


Proof Given a random coloring of K, by the two colors red and blue, consider an 
arbitrary subgraph K3, that is, an arbitrary triangle with (5) = 3 sides. The probability 
that all sides are colored red is 27°, as is the probability that all sides are colored 
blue, so the probability that the triangle is monochromatic is 27? +27? = 27? = 1/4. 
The number of triangles in K, is (5) so P(n,3), the probability that some triangle 
contained in a random coloring of K, is monochromatic, is: 


P(n, 3) = 4 i 


If P(n,3) < 1 then its complement P(n,3) = 1 — P > 0, that is, the probability that 
a random coloring of K, does not contain a monochromatic triangle is greater than 
zero, So at least one must exist. 

The following table shows P(n, 3) for several values of n, and whether the value 
of P(n, 3) proves that there exists a coloring with no monochromatic triangle: 


n  P(n,3) Exists 


3 3/4 yes 


4 5/6 yes 
5 =3/7 — 


oO 


At first glance the result is strange because Fig. 8.3a shows that there exists a 
coloring of Ks with no monochromatic coloring. However, the probabilistic criterion 
is sufficient but not necessary; it is a lower bound, meaning that R(n) > 4 which is 
true because Thm. 8.5 showed that R(n) = 6. 

The same proof works for arbitrary k, so the probability of the existence of a 
coloring of K, with no monochromatic complete graph Kx, is: 


P(n, k) = H 2-27-G), 


peoa) -Cj 


P(6,4) = (32 — 15)/32 = 17/32 
P(7,4) = (32 — 35)/32 = —3/32. 


For k = 4: 


If follows that R(4) > 6 which is much less than the known value R(4) = 18. 
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8.6 SAT Solving 


SAT solving is a method for solving problems that works by encoding a problem as a 
formula in propositional logic and then using a computer program to check the truth 
value of the formula. Advances in algorithms and implementations have made SAT 
solving a viable approach for problem solving. We give an overview of SAT solving 
and explain how it can be used to solve the mathematical problems described in this 
chapter. The reader is assumed to have an elementary knowledge of propositional 
logic as summarized in Def. 8.4. 


8.6.1 Propositional Logic and the SAT Problem 


Definition 8.4 


e A formula is composed of atomic formulas or atoms connected by the proposi- 
tional operators V (disjunction, “or’’), A (conjunction, “and’’), — (negation, “not’). 

e A formula is given an interpretation by an assignment of T or F to each atom. 
Evaluating a formula in an interpretation results in its truth value T or F. 

e A formula is satisfiable if and only if there is an interpretation that makes its truth 
value T. Otherwise, the formula is unsatisfiable. 

e A formula is in conjunctive normal form (CNF) if and only if it is composed of 
a conjunction of subformulas each of which is a disjunction of literals (atoms or 
negations of atoms). 


The following formula is in CNF: 
(ap VqVrr) A pyr) A Gr) A (pV qgvar). 


The SAT problem is to decide if a given formula in CNF is satisfiable or not. 
A SAT solver is a computer program that can solve the SAT problem. Most SAT 
solvers are based on the DPLL algorithm which goes back to the 1960’s, but recent 
developments have made very significant improvements to the algorithm. Highly 
optimized implementations of these algorithms have made SAT solvers an important 
tool for solving problems in many fields including mathematics. 
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8.6.2 Schur triples 


Let us encode the Schur triples problem S(8) as a formula in CNF. The formula will 
be satisfiable if and only if there is a decomposition of a set S into disjoint subsets 
S1, S2 such that neither S$; nor S2 contains a Schur triple. There is an atom p; for 
each of the numbers | < i < 8. The intended meaning of an interpretation for the 
formula is that it assigns T to p; if i is in the first subset S4 and it assigns F to p; if i 
is in the second subset S2. To show that in all decompositions neither subset contains 
a Schur triple, the interpretation must ensure that for each possible Schur triple at 
least one atom is assigned T and one atom is assigned F. 

For example, {2,4, 6} is a Schur triple so at least one of the three integers must 
be in Sı and at least one of them must be in S2. Therefore, p2 V p4 V pe must be true 
and also =p2 V a=p4 V ~pe must be true. There are 12 possible Schur triples so the 
CNF formula is: 


(ap V ap2 V ap3) A 
(=p, V aps V mp4) A 
(=p, V ap4 V aps) A 
(ap, V aps V ape) A 
(pı V ape V mp7) A 
(pı V ap7 V aps) A 
(p2 V aps V aps) A 
(p2 V ap4 V ape) A 
(p2 V aps V ap7) A 
(p2 V ape V aps) A 
(p3 V ap4 V ap7) A 
(ps V aps V =ps). 


(pı V p2 V ps) 
(p1 VY p3 V pa) 
(pı VY p4 V ps) 
(pı V Ps V Po) 
(p1 V Po V p7) 
(pı Y p7V ps) 
(p2 V p3 V ps) 
(p2 V Pa V po) 
(p2 V Ps V p7) 
(p2 V Po V Ps) 
(p3 V p4 V p7) 
(p3 V Ps V ps) 


(8.3) 


> > > T e a a a em 


When a SAT solver is given this formula it answers that the formula is satisfiable 
under either of the interpretations: 


Pı P2 P3 P4 P5 Po P7 P8 
F F T F T T T F 
De OE EOE Be dP OE A 


One interpretation corresponds to the decomposition in Eq. 8.2: Sı = {1,2,4, 8}, 
S2 = {3,5,6,7}, while the other corresponds to the symmetrical decomposition 
Sı = {3,5,6,7}, So = {1,2,4, 8}. 
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For S(9), four pairs of subformulas are added for the additional possible triples: 


(p1 V ps V po) ^ (api V apg V apo) A 
(p2 V p7V po) ^ (=p2 V ap7 V apo) A 
(p3 V Po V po) ^A (=p3 V “p6 V apo) A 
(p4 V ps V po) ^A (apa V aps V apo). 


When the SAT solveris given this formula, it answers that the formula is unsatisfiable, 
meaning that no decomposition has no Schur triple. Removing the double negative, 
this states that in every decomposition of S(9) there exists a Schur triple. 


8.6.3 Pythagorean Triples 


Heule and Kullmann solved the Pythagorean triple problem using a highly optimized 
SAT solver. There was a significant difference in efficiency between finding a decom- 
position that does not have Pythagorean triples (you just need one decomposition), 
and showing all that decompositions have a Pythagorean triple (you have to check 
all of them). To show that for all S(n), 1 < n < 7824, there is a decomposition 
with no triple took only one minute of computing time, whereas to show that every 
decomposition of $(7825) has a triple took about two days of computing time for a 
computer with 800 cores (processors) working in parallel, altogether 40, 000 hours 
of computing time. 

The use of computers in mathematics naturally raises the question: Can we trust 
a proof generated by a computer? Of course, even “ordinary” mathematical proofs 
can be incorrect (Sect. 4.7), but our experience with frequent computer bugs, as well 
as the opaqueness of large computer programs, makes us more sensitive to potential 
errors in computer-generated proofs. 

One approach to increasing confidence in the correctness of a computer-generated 
proof is to use two or more programs, written independently by two or more re- 
searchers. If the multiple programs are written in different programming languages 
and for different computers and operating systems, this lessens the possibility of a 
bug in the computer hardware and software. 

Heule and Kullmann’s SAT solver wrote out a log of the steps in the proof so that it 
could be examined for correctness. The log was so massive, 200 terabytes, that it was 
impossible to examine directly. To put this into perspective, 200 terabytes is 200,000 
gigabytes while your computer might have an internal memory of 16 gigabytes and 
a solid-state disk of 128 gigabytes. Instead, they wrote a small program to verify the 
correctness of the data in the log. To ensure that this program was correct, they wrote 
a formal proof using the Coq proof assistant that supports and checks the work of 
mathematicians without totally automating the proof process. 
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8.6.4 An Overview of the DPLL Algorithm 


The first algorithm that one learns for SAT solving is truth tables. Given a for- 
mula A in propositional logic with n different atoms, there are 2” interpretations 
since each atom can be independently assigned T or F. For each interpretation it is 
straightforward to compute the truth value of A using the definition of the propo- 
sitional operators. However, to check 2” interpretations is very inefficient for even 
moderately large n. 

The DPLL algorithm works by incrementally assigning T or F to an atom and 
then attempting to evaluate the formula. For example, given A = p Aq A-r, if p 
is assigned F then A evaluates to F, regardless of the assignments to q and r, and 
there is no need to perform further evaluations. Similarly, A = p V q V ~ar evaluates 
to T if p is assigned T, regardless of the assignments to q and r. 

The efficiency of DPLL comes from unit propagation. Consider part of the 
formula for Schur triples: 


(p1 V p2 V p3) A (api V mp2 V aps) A 
(pi V p3 V pa) A (p1 V aps V mp4) ^ 
(p3 V p4 V p7) A (>p3 V apa V ap7) ^ 

(p3 V ps V ps) A (p3 V aps V aps) - 


Suppose that we have assigned F to p1, p2. The first subformula reduces to the unit 
formula consisting of the single atom p3. If the formula is to be satisfied, we must 
assign T to p3 and all the subformulas: 


(pı V p2 V p3), (pı V p3 V p4), (P3 V pa V p7), (p3 V ps V ps), 


immediately evaluate to T. 

Since —p3 evaluates to F, each subformula containing ~p3 can be satisfied only 
if some other literal in the subformula is assigned T. In =p3 V aps V aps, either p5 
or pg must be assigned F so that either ~p5 or —~ps evaluates to T. 

This analysis shows that once p1, p2 have been assigned F, the formula in Eq. 8.4 
is satisfiable if and only if (~ p4 V a=p7) A (aps V aps) is satisfiable. By performing 
the propagation of p3 on all the subformulas of Eq. 8.3, the formula is reduced to: 


(p4 V ps) A (p4V po) A (ps V po) A (ps V pI) A 
(p6 V p7) ^ (pe V ps) A (p7V ps) A 
(p4 V ap7) A (aps V aps). 


One more assignment of F to p4 results in a satisfying interpretation which we have 
found after only three arbitrary assignments. 
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8.7 Pythagorean Triples in Babylonian Mathematics 


This section is a digression from Ramsey theory; it is included to give a taste of 
the rich theory of Pythagorean triples and to demonstrate the depth of mathematical 
knowledge in the ancient world. Pythagorean triples were known in Babylonian 
mathematics since at least 1800 BCE. 


Definition 8.5 A primitive Pythagorean triple is a set of three positive integers 
{a, b, c} such that a? + b? = c? and a, b, c have no common factor greater than 1. 


Example 8.3 {3,4,5} is a primitive Pythagorean triple but {6, 8, 10} is a Pythagorean 
triple that is not primitive since 2 is a common factor. 


A cuneiform tablet called Plimpton 322 is one of the earliest examples of Babylonian 
mathematics. It lists fifteen primitive Pythagorean triples by giving a and c. Table 8.1 
displays four of these triples, together with the computed values of b and other 
values that will be discussed below. Historians of mathematics have proposed several 
explanations for how these triples were found. One explanation is that Euclid’s 
formula was used to obtain the triples from a pair of generating numbers. 


Theorem 8.7 (Euclid) {a, b, c} is primitive Pythagorean triple if and only if there 
exist two positive integers u,v, called generating numbers, such that: 


lu>v 

2. they are not both odd 

3. they have no common factor greater than 1 

4. the following relations hold between {a, b, c} and u, v: 


a=u-v’, b = 2uv, c=u +v. 


Proof By computation it follows immediately that if {a, b, c} can be expressed as 
required in item 4 they form a Pythagorean triple: 


Table 8.1 Babylonian triples from the Plimpton 322 tablet 


a factors b Dyactors c u Ufactors v Vyactors 

119 7-17 120 23.3.5 169 12 22-3 5 5 
4601 43 - 107 4800 26.3.5? 6649 75 3-5? 32 2 
12709 71-179 13500 2.33.53 18541 125 53 54 2.33 


65 5-13 72 23. 3? 97 9 3? 4 2? 
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a’ +b? = (u? — v?’ + (uv)? 
= uf — 2(uv)? + vf + 4(uv)? 
= uf +2(uv)* + v4 
=Rv e. 


The proof of the other direction is more complicated and is omitted. m 


If it is true that the Babylonians used Euclid’s formula, the question remains: How 
did they discover the generating numbers u, v? 

Each row of Table 8.1 displays factors and Dyactors, the factorizations of a and 
b, respectively, showing that they have no common factors. The reader is invited 
to check that c has no common factor with a,b so the triples are primitive. The 
generating numbers u, v and Uyactors, Vfactors are also displayed. Not only do they not 
have any common factors as required by Thm. 8.7, but the only factors greater than 
1 in u and v are powers of 2, 3,5. 


Definition 8.6 A Babylonian triple is a primitive Pythagorean triple such that the 
only prime factors of u, v are 2,3, 5. 


The reason that the Babylonians restricted themselves to these factors is that they 
used the sexagesimal or base 60 = 2 - 2-3-5 number system whose prime factors 
are 2,3 and 5. 

For readers who are not familiar with non-decimal number systems, here is a brief 
overview of the concept. The “number” 12345 is a shorthand for the number: 


(1 x 10*) + (2 x 10°) + (3 x 10°) + (4 x 10!) + (5 x 10°). 


This number system is called the decimal or base 10 number system. There are 

ten digits 0,1,2,...,8,9 for the coefficients of the powers, and the powers are 

represented by the places of coefficients with powers increasing from right to left. 
The number could also be represented in the binary or base 2 number system by: 


12345 = 8192+40964+324+164+841 = 2!342!742542442342° = 11000000111001 . 


Binary notation uses two digits 0, 1 for the coefficients and the powers of two are 
indicated by the places of the coefficients. 

Another popular number system is the hexadecimal or base 16 number system 
which is used in computing. For this number system we need 16 “digits” and the 
convention is to use 0,1,2,...,8,9, A, B,C, D, E, F. 

The base 60 number system is not as unfamiliar as it may seem, because we rep- 
resent time, geographical coordinates and angles in that system. We are comfortable 
carrying out computations such as (1 hour 40 minutes) plus (1 hour 30 minutes) 
equals (3 hours 10 minutes). 
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Table 8.2 Babylonian triples in base 60 


a C 


(1) (59) (2)(49) 
(1)(16)(41) (1) (50) (49) 
(3)(31)(49) (5) {09){01) 

(1) (05) (1)(37) 


Table 8.2 shows the values of a,c that appear in the tablet in base 60 notation 
where (d) represents the d’th “digit” for O < d < 60. The reader can check that these 
values are the same as the decimal values given in Table 8.1, for example: 


(3 x 607) + (31x60!) + (49 x 60°) = 12709 
(5x 602) + (9x60!) + (1x 60°) = 18541 


The Babylonians did not have 60 distinct symbols for the digits. Instead, they 
used a hybrid system where the coefficients were represented with two symbols: one 
for the tens coefficient and the other for the ones coefficient, and the places of the 
coefficients indicated the powers of 60. Using 9 for the tens coefficient and ¢ for 
the ones coefficient, the decimal number (38 x 60) + (16 x 60°) = 2296 would be 


represented as: 


9000 OO 
FVV $9099 V OHO0N. . 


What Is the Surprise? 


Frank P. Ramsey’s theorem appeared to be a minor result in combinatorics. Sur- 
prisingly, the theorem was the foundation of an entirely new and challenging field 
of mathematics with many open problems. The nature of Ramsey theory is also 
surprising: if a set is large enough there exist regularities in its subsets. 

I was introduced to Ramsey theory by the article by Marijn J. H. Heule and 
Oliver Kullmann on Pythagorean triples whose proof bears some similarity to the 
proof of the four-color theorem: the use of massive computing resources that is only 
successful after theoretical advances. Hence the title of their article: The Science of 
Brute Force. 

Problems in combinatorics ask for specific numerical values, for example, R(n) 
must be a specific positive integer. It is surprising that probabilistic methods have 
proved so fruitful in obtaining results in this field. 
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We tend to think that humans are smarter today then they used to be thousands 
of years ago. It can be a surprise to find out that four thousand years ago Babylonian 
mathematics was sufficiently advanced to discover that {12709, 13500, 18541} is a 
Pythagorean triple. 


Sources 


For an overview of Ramsey theory see [9], while an advanced presentation can be 
found in [20]. The section on the probabilistic method is based on [43, Example 40] 
and [9, Chapter 4]. A database of Ramsey numbers can be found in [34]. 

The method of proof of the theorem on Pythagorean triples is explained in detail 
in [23]. See [4] for an introduction to logic and to SAT solving. The archive of my 
SAT solver for education [5] contains formulas for Schur triples, Ramsey graphs and 
van der Waerden’s problem. 

Section 8.7 is based upon [60], [42]. The sexagesimal number system is described 
in [63]. 
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Chapter 9 isa 
Langford’s Problem 


C. Dudley Langford noticed that his son had arranged colored blocks as shown in 
Fig. 9.1. There is one block between the red blocks, two blocks between the blue 
blocks and three blocks between the green blocks. 


Fig. 9.1 Layout of blocks for Langford’s problem 


Definition 9.1 (Langford’s Problem L(7)) Given the multiset! of positive integers: 
{1,1,2,2,3,3,...,n,n}, 


can they be arranged in a sequence such that for 1 < i < n there are i numbers 


between the two occurrences of i? 


Figure 9.1 shows that 312132 is a solution for L(3). 


Section 9.1 restates Langford’s problem using a mathematical formalism that 
facilitates solving the problem. Section 9.2 characterizes values of n for which 
L(n) is solvable and presents two proofs of the theorem. The first proof which is 
relatively simple uses the technique of double-counting: counting the same value in 
two different ways and equating the resulting formulas. The second proof is a clever 
induction but the “bookkeeping” involved requires careful attention to the details. 
Section 9.3 works out the solution for L(4). 


1 A multiset or bag is like a set except that there may be more than one occurrence of an element. 
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9.1 Langford’s Problem as a Covering Problem 


Langford’s problem can be posed using an array. For L(3) there are six columns, 
one for each position at which the six numbers can be placed. There is one row for 
each possible placement of one of the numbers, that is, the two occurrences of k 
must have k numbers between them. There are four possible placements of 1’s, three 
of 2’s and two of 3’s: 


1);2);3)4)5]6 
1} 1 1 
2 1 1 
3 1 1 
4 1 1 
5 |} 2 2 
6 2 2 
7 2 2 
8 |) 3 3 
9 3 3 


To solve the problem we need to select one row for the 1’s in the sequence, one row 
for the 2’s and one row for the 3’s, such that if we stack these rows on top of each 
other, no column contains more than one number. 

Row 9 needed not be considered because of symmetry: starting with row 9 just 
gives the reversal of the sequence obtained by starting with row 8. 

Row 8 is the only one containing 3’s so it must be chosen and the sequence is 
3-3. Any row with numbers in columns 1 and 5 can no longer be used, because 
only one number can be placed at each position. Let us denote the permissible and 
forbidden rows by: 

1,2, 2,4, 3, 6,7,8. 


Row 7 is the only remaining row containing 2’s so it must be chosen and the 
sequence is 3.2.32. Deleting rows that can no longer be used gives: 


A, 2, 3, A, Í, 6, 7, 8 $ 
Choosing the only remaining row, row 2, gives the solution 312132: 
Jı]2J]4]5]6 
2 1 1 
7 2 2 
8 |) 3 3 


The analysis has shown that this is the only solution, except for the symmetrical 
solution obtained by starting with row 9. 
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9.2 For Which Values of N Is Langford’s Problem Solvable? 


Theorem 9.1 L(n) has a solution if and only if n = 4k orn = 4k +3. 


We prove the forward direction of the theorem. Proof 1 shows that if L(n) has a 
solution then n = 4k or n = 4k + 3. Proof 2 shows the contrapositive: if n = 4k + 1 
or n = 4k +2 then L(n) has no solution. 


Proof (1) If the first occurrence of the number k is at position ig, the second oc- 
currence is at position ig + k + 1. For example, in 312132, the solution for L(3), 
choosing k = 2 gives ix = 3 andik +k+1=34+2+1=6. 

Sn, the sum of the positions of all the numbers, is: 


ep S rere 


k=l k=1 
yee 1) 
k=1 k=l 


Faery saa 


=2 
k=l 2 
But Sn is simply 1+2+3+---+2n, so: 
m O 2n(2n +1) 
Sn = k= 7 oe b 
k=1 


Equating the two formulas for S;, gives: 


“1.  n(n+3) 2n(2n+1) 
2X it = = E 


k=1 2 


2 
y . 1 (2n(2n+1) n(n+3) 
l = = 
aK a\ 2 2 


o 3n? =n 
= p~ 


The left-hand side is an integer since it is the sum of integers (the positions), so 
the right-hand side must also be an integer. When is 3n? —n divisible by 4? Factoring 
3n? — n gives n(3n — 1). 

If n is a multiple of 4, the product is divisible by 4. 

When is 3n — 1 divisible by 4? Any integer n can be expressed as n = 4i + j for 
j =0,1,2,3. If 3n — 1 is divisible by 4, then so is 3(4i + j) — 1 = 12i + 37 — 1. 12i 
is divisible by 4. For j = {0, 1,2,3}, 37 — 1 = {-1, 2,5, 8} is divisible by 4 if and 
only if j = 3, that is, n = 4i +3. oO 
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To introduce the idea of the second proof consider what a solution for n = 4 might 
look like. In the following tables the positions of the occurrences of 4 are 1 and 6, 
and the positions of the occurrences of 2 are 5 and 8. In both cases, one position is 
odd and the other is even. 


1[2[3[4[5[6|]7]/8 17/2/3/4]/516/7]8 
A) 4S (Tee lals Aa |S) Tiere sia 


Let k = 2m be an even number. If i is the position of the first occurrence of k, 
then the position of the second occurrence is i+ k + 1. The sum of the positions is: 


i+(i+k+1) =2i+2m+1=2(i+m) +1, 


which is an odd number. For the sum of two numbers to be odd, one must be odd 
and the other even. 

Let us now check the positions of the occurrences of odd numbers. The positions 
of the occurrences of | are 2 and 4, both even numbers, and the positions of the 
occurrences of 3 are 3 and 7, both odd numbers. 


1[2][3]4]/5]6]7/8 Pie sas eas 
Aq SS ee al ANS la Sa a2 
* * * * 


Let k = 2m + 1 be an odd number. The sum of the positions is: 
i+(it+tk+1)=2i+2m4+14+1=2(i+m+1), 


which is an even number. For the sum of two numbers to be even, both must be odd 
or both even. 

The positions 1,2,...,2m — 1,2n contain an equal number of even and odd 
positions. The two occurrences of a number in a row “cover” two positions. When 
the set of rows covers all the positions, they must cover an equal number of even 
positions and odd positions. Define the parity of a set of rows to be the difference 
between the number of even and odd positions covered. Initially, the parity is zero, 
and if the problem has a solution, the set of rows in the solution also has zero parity. 

When two occurrences of an even number are placed, they cover one even position 
and one odd position, so the parity remains the same: 


1 [2]3]4]5] 6 [7[8 1/2/3/4]5 16/7 


4/1/;3/1/2] 4 )3)2 4;1]/3};1]2 141/3] 2 
-1 +1 -1 +1 
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When two occurrences of an odd number are placed, the parity becomes +2 or —2, 
so we must be able to associate this pair with a pair of occurrences of another odd 
number that are placed at positions that balance out the parity: 


213] 4 |5]6|]7/8 1l2]3l4l5l6]7 
aia [3]l1ı1l2ļ]41312 4|ıl3 lıl2ļ4]3 12 
+1 +1 -1 -1 


We have shown that there can be a solution to Langford’s problem if and only if there 
is an even number of odd numbers in {1,...,n}! The theorem claims that if this is 
true then either n = 4k or n = 4k — 1, and if not then either n = 4k — 2 or 4k — 3. 


Proof (2) The proof is by induction. There are four base cases: 


e n= 4k —3 = 1. In {1} there is an odd number of odd numbers and there is no 
solution. 

e n=4k—2=2. In {1,2} there is an odd number of odd numbers and there is no 
solution. 

e n=4k-1=3. In {1,2,3} there is an even number of odd numbers and we have 
seen that there is a solution. 

e n=4k —0. In {1,2,3,4} there is an even number of odd numbers and Sect. 9.3 
gives a solution. 


The inductive hypothesis is that the theorem is true for {1,...,4k — j}, k = 1, 
0 < j < 3, and we will prove that it is true for n = 4(k + 1) — J. 


e Add 4k +1 = 4(k + 1) —3 to {1,...,4k}. By the inductive hypothesis for 
4k = 4k — 0 there is an even number of odd numbers. 4(k + 1) — 3 is odd so there 
is now an odd number of odd numbers and there is no solution. 

e Add 4k +2 = 4(k +1) -2 to {1,...,4k + 1}. By the inductive hypothesis for 
4k +1 =4(k+1)-3 there is an odd number of odd numbers. 4(k + 1) — 2 is even 
so there is still an odd number of odd numbers and there is no solution. 

e Add 4k +3 = 4(k +1) — 1 to {1,...,4k +2}. By the inductive hypothesis for 
4k +2 =4(k +1) —2 there is an odd number of odd numbers. 4(k + 1) — 1 is odd 
so there is an even number of odd numbers and a solution likely exists. 

e Add 4k +4 =4(k +1) -Oto {1,2,...,4k +3}. By the inductive hypothesis for 
4k +3 = 4(k + 1) — 1 there is an even number of odd numbers. 4(k + 1) — 0 is 
even so there is an even number of odd numbers and a solution likely exists. oO 
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9.3 Solution for L (4) 


Here is the array for L(4). Try to find the solution yourself. 


1/12/3]415]6/7]8 
1 | 1 1 
2 1 1 
3 1 1 
4 1 1 
5 1 1 
6 1 1 
7 |2 2 
8 2 2 
9 2 2 
10 2 2 
11 2 
12 || 3 3 
13 3 3 
14 3 3 
15 3 3 
16 || 4 4 
17 4 4 
18 4 4 


By symmetry row 18 may be eliminated. 


Choose row 16 and the sequence is 4-4 ... Any row with an element in position 
1 or position 6 can no longer be part of the solution. 

1,2,3, 4,5, 6, 7,8, 9, 10, 11, 72, V3, 14, 15, 16, y7 
Choose row 14 and the sequence is 4-343.. 


1,2, 3, 4, B, 6, 7,8, 9, YO, 11, y2, V3, 14, YS, 16, y7 
Choose row 8. The sequence is 423.243.. 


1, 2, 2, A, B, $, 7,8, 9, YO, V1, Y2, V3, 14, Y5, 16, y7 


All of the choices for 1’s have been eliminated so we must backtrack. 


Instead of row 8 choose row 11 and the sequence is 43.2432. 


1,2, 3, 4, B, 6, 7, B, 9, YO, 11, ¥2, y3, 14, Y5, 16, y7 


Choose row 2 and we have a solution 413124372. 
Continue backtracking to see if there is another solution. 


Instead of row 14 choose row 15 and the sequence is 4.3.4.3. 


A, 2,3,4,5, 6, 7,8, 9, VO, V1, V2, ¥3, ¥4, 15, 16, y7 
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Row 8 must be chosen and the sequence is 42_324.3. 


1, 2, B.A, B, $, 7,8, 9, YO, V1, Y2, V3, V4, 15, 16, y7 


All of the choices for 1’s have been eliminated so again we backtrack. 


Instead of row 16 choose row 17 and the sequence is 4.4... 
1, 2,3,4, 3, 6,7, 8,9, YO, 11, 12, V3, V4, 15, Y6, 17 

Choose row 15 and the sequence is 4.3.43. 
1, 2,3, 4, B, 6, 7, 8,9, VO, V1, Y2, Y3, ¥4, 15, y6, 17 

Row 9 must be chosen and the sequence is _423_243. 


1, 2, B, 4, B, $, 7, B,9, YO, V1, Y2, V3, V4, 15, Y6, 17 


All of the choices for 1’s have been eliminated. We can backtrack one last time. 


Instead of row 15 choose row 12 and the sequence is 34__3.4. 


1, 2, 2, A, B, $, 7, 8,9, YO, V1, 12, y3, V4, Y5, y6, 17 


Again, all of the choices for 1’s have been eliminated. 


Therefore the only solution is 41312432. 


What Is the Surprise? 


The source of the inspiration for a mathematical theorem can be surprising. Langford 
noticed a pattern in his son’s colored blocks which led to the interesting Thm. 9.1. 
Students should also be introduced to the fact that a theorem can have many com- 
pletely different proofs. 


Sources 


This chapter is based on [35]. [12] shows how to find a solution for n = 4k and 
n=4k +3. 
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Chapter 10 wedi 
The Axioms of Origami 


Origami, the art of paper folding, was developed several centuries ago in Japan and 
now has a worldwide following. In the late twentieth century the mathematical theory 
of origami was developed. Its foundation is a set of seven axioms, the Huzita—Hatori 
axioms, named after Humiaki Huzita who formalized the first six axioms and Koshiro 
Hatori who found the seventh. Jacques Justin published all seven axioms several years 
before Huzita and Hatori, and Margherita P. Beloch formulated the sixth axiom in 
1936. Nevertheless, the axioms as known as the Huzita-Hatori axioms. 

In a sequence of three chapters we will explore the mathematics of origami. 
This chapter presents the axioms, Chap. 11 connects origami with the roots of 
polynomials and Chap. 12 shows that constructions with origami can solve problems 
that are impossible using a straightedge and compass. 

This chapter contains a section for each of the seven axioms. Following a statement 
of an axiom and a diagram of the fold it specifies, the equations of the fold and the 
points of intersection are developed using analytic geometry. A fold can also be 
defined as a geometric locus, the set of all points satisfying some property. The term 
fold comes from the origami operation of folding a piece of paper, but here it is used 
to refer the geometric line that would be created by folding the paper. 

Folds result in reflections. Given a point p, its reflection around a fold / results in a 
point p’ such that / is the perpendicular bisector of the line segment pp’ (Fig. 10.1). 


Fig. 10.1 The fold is the perpendicular bisector of the line connecting a point and its reflection 
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10.1 Axiom 1 


Axiom 10.1 Given two distinct points pı = (x1, y1), p2 = (x2, y2), there is a unique 
fold / that passes through both of them (Fig. 10.2). 


N 
` 
` 
` 
` 
` 
‘ 

` 

` 
` 
D 
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Fig. 10.2 Axiom 1 


Derivation of the equation of the fold: The equation of the fold / is derived from 
the coordinates of pı and p2. The slope is the quotient of the differences of the 
coordinates and the intercept is derived from p1: 

277 


y-a Agg. (10.1) 
X2 — X1 


Example 10.1 Let pı = (2,2), p2 = (6,4). The equation of / is: 


4-2 
-2= -2 
y z302 
1 
y=>x+l1. 


2 
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10.2 Axiom 2 


Axiom 10.2 Given two distinct points pı = (x1, y1), p2 = (x2, y2), there is a unique 
fold / that places pı onto po (Fig. 10.3). 


The fold is the geometric locus of all points equidistant from pı and p2. 


` p2 


Fig. 10.3 Axiom 2 


Derivation of the equation of the fold: The fold / is the perpendicular bisector of 
P1P2. Its slope is the negative reciprocal of the slope of the line connecting pı and 
p2. l passes through the midpoint between the points: 


_yity2 x27%*1 k- =”) 


= (10.2) 
2 yo-y1 2 


Example 10.2 Let pı = (2,2), p2 = (6,4). The equation of / is: 


»- (4) --3 (AS) 


y=-2x+11. 
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10.3 Axiom 3 


Axiom 10.3 Given two lines l4, l2, there is a fold / that places /; onto / (Fig. 10.4). 


The fold is the geometric locus of the points that are equidistant from /; and l2, 
where the distance from a point to a line is the length of the line segment through the 
point that is perpendicular to the line. Using congruent triangles it is easy to show 
that the fold is a bisector of the angle formed by l; and l2. 


Fig. 10.4 Axiom 3 


Derivation of the equation of the fold: 

lı, l are parallel: Let l; be y = mx + bı and let l2 be y = mx + bo. The fold is the 
line parallel to /; and l2 that is halfway between them: 

+ bi + bo 
= mx + ———. 
7 2 

L, l intersect: Let l be y = mıx + by and let l be y = m2x + bo. pi = (xi, yi), the 
point of intersection of the two lines, is: 
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mx; + by = mx; + bo 
_ b-b; 

i m -m 

yi = Mıxi + b1. 


Example 10.3 Let 1, be y = 2x — 2 and let l2 be y = —x + 8. Then p; = (xi, yi) is: 


Be 10. 
oze g A 
10 14 
1=2: = -2=— x 4.67 

: 3 3 


The fold is the bisector of the angle formed by /; and l} at their point of intersection. 
There are two possible folds since there are two pairs of vertical angles. We need 
to determine the slopes of the angle bisectors. If the angle of line /; relative to the 
x-axis is 6; and the angle of line lz relative to the x-axis is 62, then the fold is the 
line which makes an angle of 6, = (@; + 62)/2 relative to the x-axis. 

Let mı = tan 6), m2 = tan 02. By Thm. A.9, ms, the slope of the line making an 
angle of 0; + @ relative to the x-axis, is: 


tan 6; + tan 62 mı + m2 


m, = tan(@; +02) = = . 
( l 2) 1 — tan 8; tan 62 1- mym 


By Thm. A.10, mp, the slope of the angle bisector, is: 


6,+6. -1+1 +tan?(01 +02) -1+41+m? 
mp = tan = = : 
2 tan(@, + 62) Ms 


Example 10.4 For y = 2x — 2 and y = —x + 8, the slope of the angle bisector is: 


ME ee 
Le A S 
-1 + VI+ 1/3} 
mp = ZENU = -3 + VTO x -6.16, 0.162. 


Let us derive the equation of the fold /y, with the positive slope. From Ex- 
ample 10.3, the coordinates of the intersection of the two lines are (10/3, 14/3). 
Therefore: 


14 10 
3 = (-3+ VLD) ae ei 
p,- 44- 10V10 

2a 3 


44- 10V1 
y = (-3 + V10)x + HENE ~ 0.162x + 4.13. 
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10.4 Axiom 4 


Axiom 10.4 Given a point pı and a line l4, there is a unique fold / perpendicular to 
1, that passes through point p, (Fig. 10.5). 


The fold is the geometric locus of all points on the line perpendicular to lı that 
passes through pı. 


Fig. 10.5 Axiom 4 


Derivation of the equation of the fold: Let /; be y = m,x+, and let pı = (x1, y1). 
l is perpendicular to /; so its slope is —(1/m1). Since it passes through pı we can 
compute the intercept b and write down its equation: 


1 
y=-—%X1 +b 
m 


= (my, +x1) 
m 
1 + 
yae yp a. 


Example 10.5 Let pı = (2,6) and let /; be y = 2x — 4. The equation of the fold / is: 


1 2-642 1 
y=-—-x+ SE eds 
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10.5 Axiom 5 


Axiom 10.5 Given two points p1, p2 and a line 44, there is a fold / that places pı 
onto /; and passes through p2 (Fig. 10.6). 


Since the fold passes through p2 and p2 is on the perpendicular bisector of pip}, 
the geometric locus of the reflection of p; is the circle centered at pọ with radius 
P1P2. The fold is constrained so that the reflection p{ is on the given line /). 


Fig. 10.6 Axiom 5 


Derivation of the equations of the folds: Let /,; be y = m,x+b, and let pı = (x1, y1), 
p2 = (x2, y2). The equation of the circle centered at p2 with radius pj pz is: 


(x - x2)? +(y- yo)" =r’, where 


r = (x2-41)? + (y2-y1). 
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Substituting the equation of /; into the equation for the circle gives: 
(x — x2)? + ((mix + bi) - y2)? =r? 


(x — x2)? + (mix + (bi - y) =P’, 


and we obtain a quadratic equation for the x-coordinates of the possible intersections: 


x (1 + m?) + 2(-x2 + mı (b — y2))x + Gs + (bt — 2biy2 + y5) — r’) =0. 
(10.3) 


Since a quadratic equation has at most two solutions, for a given pair of points and a 
line there may be zero, one or two folds. From the solutions x}, x7 we can compute 
yi> y7 from y = mıx + bı. The reflected points are pi = (x), y1) PY = (x7, y1). 


Example 10.6 Let pı = (2,8), p2 = (4,4) and let l; be y = -4x+ 3. The equation of 
the circle is (x—4)*+(y—4)? = (4-2)? + (4-8)? = 20. Substitute the equation of the 
line into the equation of the circle to obtain a quadratic equation for the x-coordinates 
of the intersections (or use Eq. 10.3): 


1 Z 
(x-4)? + ((-3+3] -4] = 20 


2 
-4+0 (54) -20=0 


5x? — 28x -12 =0 
(5x + 2)(x -6) =0. 


The two points of intersection are: 


pi = (-2/5, 16/5) = (-0.4,3.2), p{ = (6,0). 


The folds will be the perpendicular bisectors of pip‘ and pip’. 
Example 10.7 For pı = (2,8) and p = (—2/5, 16/5) the equation of l j is: 
—8+6/5) __(-2/5)-2 | _ _ 2+ (-2/5) 
A 2 ` (16/5)-8 2 


1 
y= SiE. 


Example 10.8 For pı = (2,8) and p? = (6,0) the equation of l, is: 


ala meme rs a 


8+0 6-2 2+6 
2 
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10.6 Axiom 6 


Axiom 10.6 Given two points p1, p2 and two lines l1, l2, there is a fold / that places 
pı onto lı and places pz onto ly (Fig. 10.7). 


A fold that places p; onto l; is a line p such that the distance from p; to lp is 
equal to the distance from /; to lp. The geometric locus of points equidistant from a 
point p; and a line l; is a parabola. p; is called the focus and lı is called the directrix. 
A fold is any line tangent to the parabola (Sect. 10.6.3). 


Fig. 10.7 Axiom 6 


For a fold to simultaneously place pı onto l; and p2 onto l}, it must be a tangent 
common to the two parabolas. There may be zero, one, two or three common tangents 
(Figs. 10.8a, 10.8b, 10.9a, 10.9b). 

The formula for an arbitrary parabola is quite complex so we limit the presentation 
to parabolas whose axis of symmetry is the x- or y-axis. 
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Fig. 10.9a Two common tangents Fig. 10.9b Three common tangents 


10.6.1 Derivation of the Equation of a Fold 


Let (0, f) be the focus of a parabola with directrix y = d. Define p = f — d, the 
signed length of the line segment between the focus and the directrix.! If the vertex 
of the parabola is on the x-axis the equation of the parabola is y = x?/2p. To move 
the parabola up or down the y-axis so that its vertex is at (0, A), add h to the equation 
of the parabola (Fig. 10.10): 


x2 


= — +h. 
y 2p 


1 We have been using the notation p; for points; the use of p here might be confusing but it is the 
standard notation. The formal name for p is one-half the latus rectum. 
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focus 


(0, f) = (0,4) ¢ R 


Wey at 


vertex p=6 
(0, 1) [a= 


X-axis 


directrix y = —2 


y-axis 


Fig. 10.10 The elements in the definition of a parabola 


Define a = 2ph so that the equation of the parabola is: 


2 


X a 
= 10.4 
y 2p * 2p (10.4a) 
x -2py+a=0. (10.4b) 


The equation of the parabola in Fig. 10.10 is x? — 12y + 12 = 0. 
Substitute the equation of an arbitrary line y = mx + b into Eq. 10.4b to obtain 
an equation for the points of intersection of the line and the parabola: 


x —2p(mx +b) +a =0 
x?” +(—2mp)x + (-2pb +a) =0. 


The line will be tangent to the parabola if and only if this quadratic equation has 
exactly one solution if and only if its discriminant is zero: 


(-2mp)? — 4-1-(-2pb+a) =0 (10.5a) 
mp’ +2pb-a=0. (10.5b) 
This is an equation with variables m, b for the tangents to the parabola. To obtain 


the common tangents to both parabolas we must simultaneously solve the equations 
for the two parabolas. 
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Example 10.9 
Parabola 1: Focus (0, 4), directrix y = 2, vertex (0,3). 
p=2,a=2-2-3 = 12. The equation of the parabola is: 


x*-—4y+12=0. 
Substituting p and a into Eq. 10.5b and simplifying gives: 
m’ +b-3=0. 


Parabola 2: Focus (0, —4), directrix y = —2, vertex (0, —3). 
p = —2, a = 2 . —2 - —3 = 12. The equation of the parabola is: 


x?’ +4y+12=0. 
Substituting p and a into Eq. 10.5b and simplifying gives: 
m’ -b-3=0. 
The solutions of the two equations: 


m +b-3=0 
m -b-3=0 


are m = +V3 x +1.73 and b = 0. There are two common tangents: 


Example 10.10 

Parabola 1: Unchanged. 

Parabola 2: Focus (0, —6), directrix y = —2, vertex (0, —4). 
p=-—4,a=2--4--4 = 32. The equation of the parabola is: 


x? +8y+32=0. 
Substituting p and a into Eq. 10.5b and simplifying gives: 
2m? -b-4=0. 
The solutions of the two equations: 


m+b-3=0 
Im -b-4=0 
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7 2 
are m = efi x +1.53 and b = 7 There are two common tangents: 


Example 10.11 

Let us now define a parabola whose axis of symmetry is the x-axis. 
Parabola 1: Unchanged. 

Parabola 2: Focus (4, 0), directrix x = 2, vertex (3,0). 
p=2,a=2-2-3 = 12. The equation of the parabola is: 


y —4x+12=0. (10.6) 


This is an equation with x and y? instead of x? and y, so Eq. 10.5b can’t be used and 
we must perform the derivation again. 
Substitute the equation for a line into Eq. 10.6: 


(mx + b)? —4x+12=0 
mx? + (2mb —4)x + (b° + 12) =0. 


Set the discriminant equal to zero and simplify: 


(2mb — 4)? — 4m?(b* +12) =0 


-3m ~mb+1=0. 
If we try to solve the two equations: 


m+b-3=0 


-3m —mb+1=0, 
we obtain a cubic equation with variable m: 
m? — 3m? —3m+1=0. (10.7) 


Since a cubic equation has at least one and at most three real solutions, there can be 
one, two or three common tangents. 

The formula for solving general cubic equations is quite complicated, so I used a 
calculator on the internet and obtained the three solutions: 


m = 3.73, m = —1, m=0.27. 
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From the form of Eq. 10.7 we might guess that m = 1 or m = —1 is a solution: 


P-3-1°-3-14+1=-4 
(-1)? -3-(-1)?-3-(-1)+1=0. 


Divide Eq. 10.7 by m — (—1) = m + 1 to obtain the quadratic equation m? — 4m + 1 
whose roots are the other two solutions of the cubic equation m = 2+ V3 ~ 3 .73, 0.27. 


10.6.2 Derivation of the Equations of the Reflections 


We derive the position of the reflection p| = (xj, y{) of pı = (xı, yı) around a 
tangent line l, whose equation is y = m;x + b,. First, find the line /,, with equation 
y = MpX + bp that is perpendicular to l, and passes through p1: 


Next find the intersection p; = (xr, yr) of l; and lp: 


Xt xX] 
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Yt = m,X;, + by a 
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p: is the midpoint between p; and p}: 
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Example 10.12 Let l, be y = 3x + 0 and let pı = (0,4): 


ee 
4+ — -0 
nei 3 
B+] 
ieee 
y, = V3V3 +0 =3 
xi = Day — xı = 2V3 x 3.46 


yi =2yr -y1 =2. 


10.6.3 Tangents to a Parabola 


We wish to prove that the folds of Axiom 6 are tangents to the parabolas. Figure 10.11 
shows five points p;, i = 1,...,5, each point p; at a distance a; from both the focus 
and the directrix. Drop perpendicular lines from p; to the directrix and denote the 
intersections of these lines with the directrix by p}. By Axiom 2 there are folds l; 
through p; that place p onto the directrix. The points p; are the reflections of p 
around the folds. The figure shows the fold /; through pı and the reflection p}. 


Pi . directrix y=-f 
y-axis 


Fig. 10.11 The tangent as a geometric locus 
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Fig. 10.12 The proof that the fold is a tangent 


Theorem 10.7 The folds of Axiom 6 are the tangents to the parabolas that are the 
loci of the points equidistant to the points pı, p2 and lı, l2, respectively. 


Proof In Fig. 10.12, the focus is p and the directrix is d. p’ is a point on the directrix 
and / is the fold that reflects p onto p’. Let s be the intersection of pp’ and J. Then 
ps = p's =a and l L pp’ since l is the perpendicular bisector of pp’. 

Let r be the intersection of the line perpendicular to d through p’ and the fold 
L. Then Apsr = Ap’sr by side-angle-side. It follows that pF = p'r = b sor isa 
point on the parabola. Choose a point p” on the directrix that is distinct from p’ 
and assume that the fold / also reflects p onto p”. Let q be the intersection of the 
perpendicular to d through p” 
Denote gp” = e. If q is a point on the parabola then e = qp” = Gp = c, but c is the 
hypotenuse of the right triangle Agp’’ p’ and it is not possible that the hypotenuse is 
equal to one of the other sides of the right triangle. Therefore the fold / has only one 
intersection with the parabola and must be a tangent. o 


and the fold /. Apsq = Ap’sq so pq = p'q = c. 


10.7 Axiom 7 


Axiom 10.8 Given a point pı and two lines /; and Jy, there is a fold / that places pı 
onto /; and is perpendicular to l2 (Fig. 10.13). 


The fold is the geometric locus of all points on the line perpendicular to l2 and 
equidistant from pı and p{, the reflection of pı onto l4. 
Derivation of the equation of the fold: Let pı = (x1, y1), let l4 be y = mıx + bı 
and let l2 be y = m2x + bp. Let lp be the line containing pipi. Since / L h, lp L l, 
it follows that /,, || /2 and the equation of lp is y = m2x + bp. 
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Fig. 10.13 Axiom 7 
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lp passes through p; so yı = m2x,+b, and its equation is y = max + (yı — m2x1). 


The reflection pi = (x1, y|) is the intersection of l; and lp: 


, , 
mıx + by = mxi + (yı — 2x1) 
, _ yı- Mx, — by 
xy = 
mı — m2 
$ , 
yy = mx, +b). 


The equation of the midpoint Pm = (Xm, Ym) Of Lp is: 


( yá xi +x yı+yí 
Xm, Ym — 2 > 2 
l 1 l and it passes through pm so its equation is: 
1 
y =-—x + bm, 
m2 
1 
where bm can be computed from y = —-—x + bm: 
mz 


Xm 
bm = Ym +t —. 
m2 
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The equation of the fold / is therefore: 


Example 10.13 Let pı = (5,3), let l; be y = 3x — 3 and let l2 be y = —x + 11. Then: 
p 3-(-1)-5-(-3) ll 
. E = 


' 3=(-1) 4 
,, Ul 21 
eer) ar 
ll, 21 
Se ee eee 
Be ye eS 


What Is the Surprise? 


Origami, the art of paper folding, has been practiced for hundreds of years, so it 
is surprising that the mathematical formalization goes back only to the twentieth 
century. It is even more surprising that there is an axiomatization of paper folding. 
The mathematics of origami is an excellent way to learn analytic geometry, properties 
of parabolas and the concept of geometric locus. 


Sources 


The axioms of origami are presented in [56]. Lang [26] gives descriptions of origami 
constructions. [31, Chap. 10] contains the detailed theory of the mathematics of 
origami, including the proof that two parabolas can have zero, one, two or three 
common tangents. The proof of Thm. 10.7 was shown to me by Oriah Ben-Lulu. I 
found that geometric software like Geogebra is useful for understanding the relation 
between the geometry and the algebra of the axioms. 

A clear presentation of cubic equations can be found in [6, Chapters 1, 2]. 
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Chapter 11 in 
Lill’s Method and the Beloch Fold 


11.1 A Magic Trick 


Construct a path consisting of four line segments {a3 = 1, az = 6, a; = 11, ap = 6}, 
starting from the origin along the positive direction of the x-axis and turning 90° 
counterclockwise between segments. Construct a second path as follows: construct 
a line from the origin at an angle of 63.4° and mark its intersection with az by P. 
Turn left 90°, construct a line and and mark its intersection with a; by Q. Turn left 
90° once again, construct a line and note that it intersects the end of the first path at 
(-10, 0) (Fig. 11.1). 


Fig. 11.1 A magic trick 
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Compute the negation of the tangent of the angle at the start of the second path: 
— tan 63.4° = —2. Substitute this value into the polynomial whose coefficients are 
the lengths of the segments of the first path: 


p(x) = a3x? + aox? + ax + ao 
=x + 6x7 +11x+6 
p(—tan 63.4°) = (-2)? + 6(-2)? + 11(-2) +6 =0. 


We have found a root of the cubic polynomial x? + 6x? + 11x + 6! 
Let us continue the example. The polynomial p(x) = x? + 6x? + 11x +6 has three 
roots —1, —2, —3. Compute the arc tangent of the negation of the roots: 


æ =—tan7!(-1) = 45°, B=-tan7!(-2) ~ 63.4°, y =—tan!(-3) ~ 71.6°. 


For each angle the second path intersects the end of the first path (Fig. 11.2). 

The value — tan 56.3 ~ —1.5 is not a root of the equation. Fig. 11.3 shows the 
result of the application of the method for this angle. The second path does not 
intersect the line segment for the coefficient ap at (—10, 0). 

This example demonstrates a method discovered by Eduard Lill in 1867 for 
graphically finding the real roots of any polynomial. We are not actually finding the 
roots but verifying that a given value is a root. 

Section 11.2 presents a formal specification of Lill’s method (limited to cubic 
polynomials) and gives examples of how it works in special cases. A proof of the 


Fig. 11.2 Lill’s method for the three roots of the polynomial 
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Fig. 11.3 A path that does not lead to a root 


correctness of Lill’s method is given in Sect. 11.3. Section 11.4 shows how the 
method can be implemented using origami Axiom 6. This is called the Beloch fold 
and preceded the formalization of the axioms of origami by many years. 


11.2 Specification of Lill’s Method 


11.2.1 Lill’s Method as an Algorithm 


3 2 


e Start with an arbitrary cubic polynomial p(x) = a3x° + a2x^ + a,x + ao. 


e Construct the first path: 


— For each coefficient a3, a2, a1, a9 (in that order) construct a line segment of 
that length, starting at the origin O = (0,0) in the positive direction of the 
x-axis. Turn 90° counterclockwise between each segment. 


e Construct the second path: 


— Construct a line from O at an angle of 0 with the positive x-axis that intersects 
az at point P. 

— Turn +90° and construct a line from P that intersects a; at Q. 

— Turn +90° and construct a line from Q that intersects ao at R. 

— If R is the end point of the first path then — tan 8 is a root of p(x). 
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Fig. 11.4 Lill’s method with negative roots 


e Special cases: 


— When constructing the line segments of the first path, if a coefficient is negative, 
construct the line segment backwards. 

— When constructing the line segments of the first path, if a coefficient is zero, 
do not construct a line segment but continue with the next +90° turn. 


e Notes: 


— The phrase intersects a; means intersects the line segment a; or any extension 
of qj. 

— When building the second path choose to turn left or right by 90° so that there 
is an intersection with the next segment of the first path or its extension. 


11.2.2 Negative Coefficients 


Let us demonstrate Lill’s method on the polynomial p(x) = x? — 3x? — 3x + 1 with 
negative coefficients (Sect. 10.6). Start by constructing a segment of length 1 to the 
right. Next, turn 90° to face up, but since the coefficient is negative, construct a 
segment of length 3 down, that is, in a direction opposite of the arrow. After turning 
90° to the left, the coefficient is again negative, so construct a segment of length 3 
to the right. Finally, turn downwards and construct a segment of length | (Fig. 11.4, 
the loosely dashed lines will be discussed in Sect. 11.2.4). 
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7 


Fig. 11.5 Lill’s method with polynomials with zero coefficients 


Start the second path with a line at 45° with the positive x-axis. It intersects the 
extension of the line segment for az at (1, 1). Turning —90° (to the right), the line 
intersects the extension of the line segment for a; at (5, —3). Turning —90° again, 
the line intersects the end of the first path at (4, —4). Since — tan 45° = —1, we have 
found a root of the polynomial: 


p(-1) = (-1)? - 3(-1)? - 3(-1) +6 =0. 


11.2.3 Zero Coefficients 


an, the coefficient of the x” term in the polynomial x3 —7x —6 = 0, is zero. Construct 
a line segment of length 0, that is, do not construct a line, but still make the +90° turn 
as indicated by the arrow pointing up at (1, 0) in Fig. 11.5. Turn again and construct 
a line segment of length —7, that is, of length 7 backwards, to (8,0). Finally, turn 
once more and construct a line segment of length —6 to (8, 6). 

The second paths with the following angles intersect the end of the first path: 


—tan-!(-1) = 45°, —tan™!(—2) = 63.4°, -—tan™!3 ~-71.6°. 
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Fig. 11.6 Lill’s method with non-integer roots 


We conclude that there are three real roots {-1, —2, 3}. Check: 


(x + 1)(x +2)(% — 3) = (x? + 3x +2)(x - 3) =x -7x-6. 


11.2.4 Non-integer Roots 


Figure 11.6 shows Lill’s method for p(x) = x° — 2x + 1. The first path goes from 
(0,0) to (1,0) and then turns up. The coefficient of x? is zero so no line segment is 
constructed and the path turns left. The next line segment is of length —2 so it goes 
backwards from (1,0) to (3,0). Finally, the path turns down and a line segment of 
length 1 is constructed from (3,0) to (3,-1). 

It is easy to see that if the second path starts at an angle of —45° it will intersect 
the first path at (3, —1). Therefore, — tan~'(—45)° = 1 is a root. If we divide p(x) by 
x — 1, we obtain the quadratic polynomial x? + x — 1 whose roots are: 


-1 + V5 


~ 0.62, —1.62. 
7 0.62, 6 


There are two additional second paths: one starting at — tan™! 0.62 ~ —31.8°, and 
the other starting at — tan~!(—1.62) ~ 58.3°. 

The polynomial p(x) = x3—3x?—3x+1 (Sect. 11.2.2) has roots 2+ V3 ~ 3.73, 0.27. 
The corresponding angles are — tan™! 3.73 ~ —75° and —tan7!0.27 ~ —15° as 
shown by the loosely dashed lines in Fig. 11.4. 
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Fig. 11.7 The cube root of two 


11.2.5 The Cube Root of Two 


To double a cube, compute V2, a root of the cubic polynomial x? — 2. In the 
construction of the first path, turn left twice without constructing any line segments, 
because az and a, are both zero. Then turn left again (to face down) and construct 
backwards (up) because ag = —2 is negative. The first segment of the second path is 
construct at an angle of — tan~! ¥2 ~ —51.6° (Fig. 11.7). 


11.3 Proof of Lill’s Method 


The proof is for monic cubic polynomials p(x) = x° + a2x? + a,x + ao. If the 
polynomial is not monic, divide it by a3 and the resulting polynomial has the same 
roots. In Fig. 11.8 the line segments of the first path are labeled with the coefficients 
and with bo, bj, a2 — b2, a; — bj. In a right triangle if one acute angle is 8 the other 
angle is 90° — 0. Therefore, the angle above P and the angle to the left of Q are equal 
to 6. Here are the formulas for tan 6 as computed from the three triangles: 


b 
tan 8 = = =b 
b b 
tan 0 = l = 1 
a2—b2 a-tanð 
tan 0 = ae A 


aı— bı a, — tan 8 (a2 — tan 0) ` 
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Fig. 11.8 Proof of Lill’s method 


Simplify the last equation, multiply by —1 and absorb —1 into the powers: 


(tan 6)? — az(tan 6)? + aj (tan 8) — ao = 0 


(—tan 8)? + a(—tan 0)? + a,(—tan@) + ao =0. 


It follows that — tan 8 is a real root of p(x) = x? + anx? + ax + ao. 


11.4 The Beloch Fold 


Margharita P. Beloch discovered a remarkable connection between folding and Lill’s 
method: one application of the operation later known as origami Axiom 6 generates 
a real root of a cubic polynomial. The operation is often called the Beloch fold. 

Consider the polynomial p(x) = x? + 6x? + 11x +6 (Sect. 11.1). Recall that a fold 
is the perpendicular bisector of the line segment between any point and its reflection 
around the fold. We want RS in Fig. 11.9 to be the perpendicular bisector of both 
QO’ and PP’, where Q’, P’ are the reflections of Q, P around RS, respectively. 

Construct a line ay parallel to az at the same distance from az as az is from P, 
and construct a line a parallel to a; at the same distance from a, as a, is from Q. 
Apply Axiom 6 to simultaneously place P at P’ on a, and to place Q at Q’ ona}. 
The fold RS is the perpendicular bisector of the lines PP’ and QQ’ so the angles at 
R and S are both right angles as required by Lill’s method. 
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Fig. 11.9 The Beloch fold for finding a root of x? + 6x? + 11x+6 


Figure 11.10 shows the Beloch fold for the polynomial x* — 3x? — 3x + 1 
(Sect. 11.2.2). az is the vertical line segment of length 3 whose equation is x = 1, 
and its parallel line is a, whose equation is x = 2, because P is at a distance of 1 
from a2. a, is the horizontal line segment of length 3 whose equation is y = —3, 
and its parallel line is a whose equation is y = —2 because Q is at a distance of 1 
from a;. The fold RS is the perpendicular bisector of both PP’ and QQ’, and the 
line PRSQ is the same as the second path in Fig. 11.4. 
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Fig. 11.10 The Beloch fold for finding a root of x3 — 3x? — 3x + 1 


What Is the Surprise? 


Performing Lill’s method as a magic trick never fails to surprise. It can be performed 
during a lecture using graphics software such as GeoGebra. It is also surprising that 
Lill’s method, published in 1867, and Beloch’s fold, published in 1936, preceded the 
axiomatization of origami by many years. 


Sources 


This chapter is based on [8, 24, 40]. 
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Chapter 12 updates | 
Geometric Constructions Using Origami 


This chapter shows that constructions with origami are more powerful than construc- 
tions with a straightedge and compass. We give two constructions for trisecting an 
angle, one by Hisashi Abe (Sect. 12.1) and the other by George E. Martin (Sect. 12.2), 
two constructions for doubling a cube, one by Peter Messer (Sect. 12.3) and the other 
by Marghareta P. Beloch (Sect. 12.4), and the construction of a nonagon, a regular 
polynomial with nine sides (Sect. 12.5). 


12.1 Abe’s Trisection of an Angle 


Construction: Given an acute angle ZPQR, construct p, the perpendicular to QR 
at Q. Construct q, a perpendicular to p that intersects PQ at point A, and construct 
r, the perpendicular to p at B that is halfway between Q and A. Using Axiom 6 
construct the fold Z that places A at A’ on PQ and Q at Q’ on r. Let B’ be the 
reflection of B around /. Construct lines through QB’ and QQ’ (Fig. 12.1). 


Theorem 12.1 2PQB’ = ZB’QQ’ = LQ'QR = £PQR/73. 


Proof (1) A’, B’, Q’ are reflections around the line / of the points A, B, Q on the line 
DQ, so they are on the reflected line DQ’. By construction AB = BQ, LABQ’ = 
ZQBQ’ = 90° and BQ’ is a common side, so AABQ’ = AQBQ?’ by side-angle-side. 
Therefore, ZAQQ’ = ZQAQ’ = a so AAQ’Q is isoceles (Fig. 12.2). 

By reflection AAQ’Q = AA’QQ’, so AA’QQ’ is also an isoceles triangle. 
OB’, the reflection of Q’B, is the perpendicular bisector of an isoceles trian- 
gle, so ZA’QB’ = ZQ’QB’ = LQQ’B = a. By alternating interior angles, 
ZQ’QR = £QQ'B = a. Together we have: 


APQB' = ZA'QB' = £B'OO’ = LQ'QR =e. g 
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Fig. 12.1 Abe’s trisection of an angle 


Proof (2) Since | is a fold it is the perpendicular bisector of QQ’. Denote the 
intersection of / with QQ’ by U and its intersection with QB’ by V (Fig. 12.2). 
AVUQ = AVUQ" by side-angle-side since VU is a common side, the angles at U 
are right angles and QU = Q’U. Therefore, 4VQU = ZVQ’U = a and ZQ'QR = 
ZVQ'U = a by alternating interior angles. 

As in the first proof A’, B’,Q’ are all reflections around l, so they are on the 
line DQ’ and A’B’ = AB = BO = B’Q’ = a. Then AA’B’O = AQ’B’O by 
side-angle-side and 2A’QB’ = ZQ’QB’ =a. o 
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D 


Fig. 12.2 Proofs of Abe’s trisection (U, V are used in Proof 2) 


12.2 Martin’s Trisection of an Angle 


Construction: Given an acute angle ZPQR, let M be the midpoint of PQ. Construct 
p the perpendicular to QR through M and construct q perpendicular to p through M 
so q || OR. Using Axiom 6 construct the fold / that places P at P’ on p and Q at Q’ 
on q. If more than one fold is possible choose the one that intersects PM. Construct 
PP’ and QQ’ (Fig. 12.3). 


Theorem 12.2 2Q’QR = £PQR/3. 


Proof Denote the intersection of QQ’ with p by U and its intersection with / by V. 
Denote the intersection of PO and P’Q’ with / by W. It is not immediate that PQ 
and P’Q’ intersect / at the same point. But APWP’ ~ AQW(Q’ so the altitudes bisect 
both vertical angles ZPWP’, ZQWQ’ and they must be on the same line. 

AQMU = APMP’ by angle-side-angle since ZP’PM = ZUQM = £ by alternate 
interior angles, QM = MP = a because M is the midpoint of PQ and ZQMU = 
LPMP” = y are vertical angles. Therefore, P'M = MU = b. 
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Fig. 12.3 Martin’s trisection of an angle 


AP’MQ’ = AUMQ' by side-angle-side, since P’M = MU = b, the angles at 
M are right angles and MQ’ is a common side. Since the altitude of the isoceles 
triangle AP’Q’U is the bisector of ZP’Q’U, it follows that 2P’Q’M = ZUQ’M =a. 
Furthermore, 2UQ’M = 2Q’QR = aby alternate interior angles. AQWV = AQ’WV 
by side-angle-side since QV = VQ’ = c, the angles at V are right angles and VW is 
a common side. Therefore: 


LWQOV = 8 = ZLWOQ’V =2a 
ZPQR=Bt+a=3a. Oo 


12.3 Messer’s Doubling of a Cube 


A cube of volume V has sides of length WV. A cube with twice the volume has sides 
of length Y2V = V2VV, so if we can construct V2 we can multiply by the given 
length VV to double the cube. 


Construction: Divide the side of a unit square into thirds as follows: Fold the square 
in half to locate the points J = (0, 1/2) and J = (1, 1/2). Next, construct the lines 
AC and BJ (Fig. 12.4). The point of intersection K = (2/3, 1/3) can be obtained by 
solving the two equations y = 1 — x and y = x/2. 
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A=(0,) - D= (1,1) 


G = (0, 2/3) H = (1, 2/3) 


I = (0, 1/2) 4 J = (1, 1/2) 


E = (0, 1/3) F = (1, 1/3) 


. 


(2/3, 1/3). 


B= (0,0) be “dC = (1,0) 


Fig. 12.4 Dividing a length into thirds 


Construct EF, the perpendicular to AB through K, and construct the reflection 
GH of BC around EF. The side of the square has now been divided into thirds. 

Using Axiom 6 place C at C’ on AB and F at F’ on GH. Denote by L the point 
intersection of the fold with BC and denote by b the length of BL. Rename the length 
of the side of the square to a + 1 where a = AC’. The length of LC is (a + 1) — b 
(Fig. 12.5). 


Theorem 12.3 AC’ = V2. 


Proof When the fold is performed the line segment LC is reflected onto the line 
segment LC’ and CF is folded onto the line segment C’ F”. Therefore: 


a+1 2a-1 


GC’ =a- 12.1 
a- 3 (12.1) 
Since ZFCL is a right angle, so is ZF’C’L. 
AC’ BL is a right triangle so by Pythagoras’s Theorem: 
1? +b? =((a+1)-b)* (12.2a) 
2 
2 
po. (12.2b) 
2(a +1) 


LGC'F' + LF'C’L+ZLC’B = 180° since they form the straight line GB. Denote 
LGC’ F’ by a. Then: 
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A A 
a+1 
3 
a 
X 
2a — 1 
3 
v Vv 


Fig. 12.5 Construction of v2 


ZLC’B = 180° — ZF’C’L — ZGC’F’ = 180° - 90° -@ = 90° - a, 


which we denote by a’. The triangles AC’BL, AF’GC’ are right triangles so 
ZC’LB =a and ZC’ F’G =a’. Therefore, AC’BL ~ AF’GC’ and: 


BL GC’ 
CL CF 
Using Eq. 12.1 we have: 
2a- 1 
b 3 


(a+1)-b atl 


Substituting for b using Eq. 12.2b gives: 


a° +2a 
2(a+1) _2a-l1 
242a a+1l ` 
OEE E +2a 
2(a+1) 


Simplify the equation to obtain a? = 2 and a = V2. o 
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12.4 Beloch’s Doubling of a Cube 


Since the Beloch fold (Axiom 6) can solve cubic equations it is reasonable to 
conjecture that it can be used to double a cube. Here we give a direct construction 
that uses the fold. 


Construction: Let A = (—1,0), B = (0,—2). Let p be the line x = 1 and let q be 
the line y = 2. Use the Beloch fold to construct the fold / that places A at A’ on p 
and B at B’ on q. Denote the intersection of the fold and the y-axis by Y and the 
intersection of the fold and the x-axis by X (Fig. 12.6). 


Theorem 12.4 OY = V2. 


Proof The fold is the perpendicular bisector of both AA’ and BB’ so AA’ || BB’. 
By alternate interior angles YAO = ZBXO = a. The labeling of the other angles 
in the figure follows from the properties of right triangles. 


` y-axis ay 


X-axis 


(0, -2) 


Fig. 12.6 Beloch’s doubling of the cube 
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AAOY ~ AYOX ~ AXOB and OA = 1, OB = 2 are given so: 


OY OX OB 
OA OY OX 
OY OX 2 

1 OY OX 


From the first and second ratios we have OX = oY and from the first and third 
ratios we have OY OX = 2. Substituting for OX gives OY =2andOY=VY2. o 


12.5 Construction of a Regular Nonagon 


A nonagon (a regular polygon with nine sides) is constructed by deriving the cubic 
equation for its central angle and then solving the equation using Lill’s method and 
the Beloch fold. The central angle is 6 = 360°/9 = 40°. By Thm. A.6: 


cos 36 = 4cos? 6 — 3 cos 0. 


Let x = cos 40°. Then for the nonagon the equation is 4x* — 3x + (1/2) = 0 since 
cos3 - 40° = cos120° = —(1/2). Figure 12.7 shows the paths for the equation 
constructed according to Lill’s method. 


Fig. 12.7 Lill’s method for a nonagon 


The second path starts from P at an angle of approximately —37.45°. Turns of 
90° at R and then —90° at S cause the path to intersect the first path at its endpoint 
Q. Therefore, x = — tan(—37.45°) ~ 0.766 is a root of 4x? — 3x + (1/2). 
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The root can be obtained using the Beloch fold. Construct the line a’, parallel to 
az at the same distance from a2 as az is from P. Although the length of az is zero, 
it still has a direction (upwards) so the parallel line can be constructed. Similarly, 
construct the line aji parallel to a; at the same distance from a; as a; is from Q. 
The Beloch fold RS simultaneously places P at P’ on a, and Q at Q’ on a}. This 
constructs the angle ZSPR = —37.45° (Fig. 12.8). 


Fig. 12.8 The Beloch fold for solving the equation of the nonagon 


By Lill’s method — tan(—37.45°) ~ 0.766 and therefore cos 6 ~ 0.766 is a root 
of the equation for the central angle 6. We finish the construction of the nonagon by 
constructing cos~! 0.766 ~ 40°. 

The right triangle AABC with ZCAB ~ 37.45° and AB = 1 has opposite side 
BC ~ 0.766 by definition of tangent (Fig. 12.9a). Fold CB onto the AB so that the 
reflection of C is D and DB = 0.766. Extend DB and construct E so that DE = 1. 
Fold DE to reflect E at F on the extension of BC (Fig. 12.9b). Then: 


-10.766 | 


ZBDF = cos 40°. 
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Fy v., 
c "e 
$C % 
0.766 1 
z . j 40° 
A D B “E 
e 0.766 ° 
< 1 > 
< 1 > < 1 > 
Fig. 12.9a The tangent that is the solution of Fig. 12.9b The cosine of the central angle of 
the equation for the nonagon the nonagon 
What Is the Surprise? 


We saw in Chaps. 2 and 3 that tools such as the neusis can perform constructions 
that cannot be done with a straightedge and compass. It is surprising that trisecting 
an angle and doubling a cube can be constructed using only paper folding. Roger 
C. Alperin has developed a hierarchy of four methods of construction each more 
powerful than the previous one. 


Sources 


This chapter is based on [2, 26, 31, 36]. 
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Chapter 13 uodas) 
A Compass Is Sufficient 


In 1797 Lorenzo Mascheroni proved that any construction carried out with a straight- 
edge and compass can be carried out with only a compass. Later it came to light 
that this theorem had already been proved by Georg Mohr in 1672. After explaining 
in Sect. 13.1 what is meant by performing a construction with only a compass, the 
proof is presented in stages starting with four auxiliary constructions: reflection of a 
point (Sect. 13.2), construction of a circle with a given radius (Sect. 13.3), addition 
and subtraction of line segments (Sect. 13.4) and construction of a line segment as 
a ratio of segments (Sect. 13.5). Section 13.6 shows how to find the intersection of 
two lines and Sect. 13.7 shows how to find the intersection of a line and a circle. 


13.1 What Is a Construction With Only a Compass? 


Figure 13.1a shows the construction of an equilateral triangle using a straightedge and 
compass. How can we construct a triangle without the line segments AB, AC, BC? 
A line segment is defined by two points, so it is sufficient to construct these points in 
order to obtain a construction equivalent to the one with a straightedge (Fig. 13.1b). 
There is no need to actually see the line segments. There will be lines in the figures 
in this chapter, but they are used only to understand the construction and the proof of 
its correctness. It is important to convince yourself that the construction itself uses 
only a compass. 

A construction using a straightedge and compass is a sequence of three operations: 


e Find the point of intersection of two lines. 
e Find the point(s) of intersection of a line and a circle. 
e Find the point(s) of intersection of two circles. 


The third operation can be done with only a compass. We need to show that the first 
two operations can be done with a compass alone. 
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C C 
A B A B 
Fig. 13.1a Construction of an equilateral tri- Fig. 13.1b Construction of an equilateral tri- 
angle with a straightedge and a compass angle with only a compass 


Notation: 


e c(O, A): the circle with center O through point A. 
e c(O,r): the circle with center O and radius r. 
e c(O, AB): the circle with center O and radius the length of line segment AB. 


13.2 Reflection of a Point 


Definition 13.1 A point C’ is a reflection of the point C around a line segment AB 
if and only if AB (or the line containing AB) is the perpendicular bisector of the line 


segment CC’. 


Theorem 13.1 Given a line AB and a point C not on AB, it is possible to build C’, 
the reflection of C around AB. 


Proof Construct a circle centered on A passing through C and a circle centered on 
B passing through C. The other intersection of the two circles is the point C’ which 
is the reflection of C (Fig. 13.2). AABC = AABC’ by side-side-side since AC, AC’ 
are radii of the same circle, as are BC, BC’ and AB is a common side. Therefore, 
ZCAB = ¿C'AB so AB is the angle bisector of ZCAC’. But ACAC’ is an isosceles 
triangle and the angle bisector AB is also the perpendicular bisector of CC’, the base 
of ACAC’. By definition C’ is the reflection of C around AB. o 
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Fig. 13.2 Construction of a reflection 


13.3 Construction of a Circle With a Given Radius 


Theorem 13.2 Given points A, B,C it is possible to construct c(A, BC), the circle 
centered at A with radius BC. 


Proof Construct c(A, B) and c(B, A) and let X,Y be their points of intersection 
(Fig. 13.3). A is the reflection of B around XY since AYAX = AYBX by side-side- 
side. By Thm. 13.1 construct C’, the reflection of C around XY and then construct 
c(A, AC’) (Fig. 13.4). 

XY is the perpendicular bisector of CC’ and AB. Denote the intersection of 
XY and AB by D and the intersection of XY and CC’ by E. Then C’E = EC, 
AD = DB and ZDEC = LDEC'’ is aright angle, so ADEC = ADEC’ by side- 
angle-side. Therefore, DC = DC’ and ZADC’ = ZBDC (they are complementary 
to ZEDC’ = LEDC), It follows that AADC’ = ABDC by side-angle-side so 
AC’ = BC, o 


Fig. 13.3 Construction of a circle with a given radius (1) 
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Fig. 13.4 Construction of a circle with a given radius (2) 


13.4 Addition and Subtraction of Line Segments 


Theorem 13.3 Given a line segment PQ of length a and a line segment RS of length 
b, it is possible to construct line segments OT, OU such that PTQU is aline segment, 
the length of PT is a — b and the length of PU is a + b (Fig. 13.5). 


P T Q U 

e + + + S 
< a > i 
<— a-b — b 

< a+b > 


Fig. 13.5 Addition and subtraction of line segments 


The proof is quite long and will be presented as a sequence of constructions. 


Theorem 13.4 An isoceles trapezoid can be constructed. 


Proof Let H be any point on c(Q, b). Construct H’ its reflection around PQ. Denote 
the length of HH’ by h (Fig. 13.6). 


Construct the circles c(H, b), c(Q, h). Let K be a point of intersection of the 
circles and construct K’ the reflection of K around PO (Fig. 13.7). 
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q 


Fig. 13.6 Construction of an isoceles trapezoid (1) 


The line containing PQ is the perpendicular bisector of HH’ and KK’ so HH’ || 
KK’. KH = b since it is the radius of the circle centered on H, and K’, H’ are 
reflections of K, H. AQQ’H = AQQ’H' by side-side-side and AKQH = AK’QH’ 
by side-angle-side, so K’H’ = KH = b. It follows that KHH’K’ is an isosceles 
trapezoid whose bases are HH’ = h, KK’ = 2h (Fig. 13.8). Denote the length of the 
diagonals K’H = KH’ by d. o 


Theorem 13.5 An isoceles trapezoid can be circumscribed by a circle. 
Proof The theorem follows immediately from Thms. A.15 and A.16. m 
Theorem 13.6 For d, b, h shown in Fig. 13.8, Ê =b +2. 


Proof The theorem follows from Ptolemy’s theorem (Thm. A.18) which says that 
in a quadrilateral circumscribed by a circle the product of the diagonals equals the 
sum of the products of the opposite sides. o 


ov 


K’ 


Fig. 13.7 Construction of an isoceles trapezoid (2) 
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K’ 
Fig. 13.8 Construction of an isoceles trapezoid (3) 


The proof of Thm. 13.3 can now be given. 


Proof Let X be the point on line PQ that extends PQ by b. (We will eventually 
construct X.) Define x = K’X. From Thm. 13.6: 


d? =b 42h? = (x? - h) 42h =x? +97. 


Since AQK’X is aright triangle x? = b? + h? (Fig. 13.9). 


+ 


Fig. 13.9 Application of Ptolemy’s theorem 
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Construct S as the intersection of c(K, d), c(K’, d) (Fig. 13.10). AOSK’ is aright 
triangle so by Pythagoras’s Theorem OS” = d? — k? = x? and OS =x. 


$v 


Fig. 13.10 Construction of the point for addition and subtraction (1) 


Construct X as the intersection of c(K, x), c(K’,x) (Fig. 13.11). Since the length 
of OX is Vx? — h? = b the length of PX is a + b and the length of PX’ is a- b. o 


OTA 


K’ 


Fig. 13.11 Construction of the point for addition and subtraction (2) 
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13.5 Construction of a Line Segment as a Ratio of Segments 


Theorem 13.7 Given line segments of length n,m, s, it is possible to construct a line 
segment of length: . 
x=—S. 
m 
Proof Construct two concentric circles cy = c(Z, m) and c2 = c(Z, n),! and choose 
an arbitrary point A on cy. By Thm. 13.2 construct a chord AB of length s on cı 
(Fig. 13.12a). If AB intersects c2, by Thm. 13.3 multiply m,n by a number k so that 


the chord does not intersect the circle. Note that this does not change the value that 


. . kn n 
we are trying to construct since x = — s = — 8. 
km m 


Choose a point H on c and denote the length of AH by w. Construct K on cz 
such that the length of BK is w (Fig. 13.12b). AAHZ = ABZK by side-side-side 
since ZA = ZB = mare the radii of the same circle, as are ZH = ZK =n, and 
AH = BK = w by construction (Fig. 13.13a). From AAHZ = ABZK it follows 
ZAZH = ZBZK and then ZAZB = LHZK. It is difficult to see this equality from 
the diagram, but Fig. 13.13b should clarify the relation among the angles. 

AZAB ~ AZHK since both are isosceles triangles and we have shown that they 
have the same vertex angle. Label HK by x. Then: 


S 


Cl Ci 


ANa AN 


s B S B 


Fig. 13.12a Construction of x = 7s, step 1 Fig. 13.12b Construction of x = 77s, step 2 


1 We assume that m > n; if not, exchange the notation. 
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cl 


—— 


B 


Fig. 13.13a Construction of x = 7s, step 3 Fig. 13.13b ZAZB = LHZK 


13.6 Construction of the Intersection of Two Lines 


Theorem 13.8 Given two lines containing the line segments AB, CD, it is possible 
to construct their intersection S. 


Proof Let C’, D’ be the reflections of C, D around AB. There are two cases de- 
pending on whether C, D lie on the same side of AB or on different sides. Label 
x = CS,c = CC’,d = DD’,e = CD as shown in Figs. 13.14, 13.15. We compute 
the value of x for each case. 

Case I: C,D are on the different sides of AB. S lies on AB because ACZS = 
AC’ ZS by side-angle-side: CZ = ŒZ, ¿CZS = LC’ZS = 90° and ZS is a common 
side. Te C’S = CS and similarly D’S = DS. ACSC’ ~ ADSD’ are similar 
so 


= Ê and solving the equation gives x = : 
e-x ae 


Fig. 13.14 Construction of the intersection of two lines (1) 
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we 


Cc’ 


Fig. 13.15 Construction of the intersection of two lines (2) 


Case 2: C, D are on the same side of AB. ACSC’ ~ ADSD’ gives 2 È 7 
x-e 


and solving the equation gives x = =a 
c- 


Construct the circles c(C’,d),c(D,e) and denote their intersection by H 
(Fig. 13.16). The sum of the line segments CC”, C'H is c + d. We have to show 
that H is on the extension of CC’ so that CH is a line segment of length c + d. 
CH =c —d incase D is on the same side of AB as C (not shown in the diagram). 

H is the intersection of c(C’, d), c(D, e) so DH = e, C'H = d. By construction 
C’D’ = e, D’D = d so the quadrilateral C’D’ DH is a parallelogram. 


Fig. 13.16 Construction of the intersection of two lines (3) 
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Fig. 13.17 Construction of the intersection of two lines (4) 


By construction DD’ || CC’ so C’H || DD’ and therefore C’H || CC’. Since one 
of its end points is C’ it must be on the line containing CC’. By Thm. 13.3, from the 
lengths c, d, e a line segment of length c + d can be constructed and by Thm. 13.7 a 


A c : x 
line segment of length x = TJE can be constructed. S, the intersection of c(C’, x) 
c 


and c(C, x), is also the intersection of AB,CD (Fig. 13.17). o 


13.7 Construction of the Intersection of a Line and a Circle 


Theorem 13.9 Given a circle k = C(M,r) anda line l it is possible to construct the 
intersections of k and l. 


Proof Construct M’, be the reflection of M around / and construct the circle k’ = 
c(M’,r). Since MYM’ = AMXM'’, X,Y, the points of intersection of k, k’, are the 
points of intersection of / and k (Fig. 13.18). 

This construction cannot be done if M is on the line Z. In that case choose an 
arbitrary point A on / that is at a distance more than r from M. Using Thm 13.3 
shorten and lengthen AM by r. X,Y, the endpoints of these segments, are the 
intersections of k and / (Fig. 13.19). oO 
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Fig. 13.18 Construction of the intersection of a line and a circle (1) 


What Is the Surprise? 


When one learns about constructions with a straightedge and compass it is obvious 
that both tools are necessary. Therefore, it was quite a surprise to find out that 
a compass is sufficient. The proof is quite long so we are not going to leave the 
straightedge at home, but the theorem shows that we should not assume that there 
are no alternatives to well-known mathematical concepts. 


Sources 


This chapter is based on problem 33 of [13] reworked by Michael Woltermann [14]. 
An additional proof can be found in [25]. 


Fig. 13.19 Construction of the intersection of a line and a circle (2) 
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Chapter 14 ag 
A Straightedge and One Circle is Sufficient 


Can every construction with a straightedge and compass be done with only a straight- 
edge? The answer is no because lines are defined by linear equations and cannot 
represent circles which are defined by quadratic equations. In 1822 Jean-Victor Pon- 
celet conjectured that a straightedge only is sufficient provided that one circle exists 
in the plane. This was proved in 1833 by Jakob Steiner. 

After explaining in Sect. 14.1 what is meant by performing a construction with 
only a straightedge and one circle, the proof is presented in stages starting with five 
auxiliary constructions: construction of a line parallel to a given line (Sect. 14.2), 
construction of a perpendicular to a given line (Sect. 14.3), copying a line segment 
in a given direction (Sect. 14.4), construction of a line segment as a ratio of segments 
(Sect. 14.5) and construction of a square root (Sect. 14.6). Section 14.7 shows how 
to find the intersection(s) of a line with a circle and Sect. 14.8 shows how to find the 
intersection(s) of two circles. 


14.1 What Is a Construction With Only a Straightedge? 


A construction using a straightedge and compass is a sequence of three operations: 


e Find the point of intersection of two lines. 
e Find the point(s) of intersection of a line and a circle. 
e Find the point(s) of intersection of two circles. 


The first operation can be performed with a straightedge only. 

A circle is defined by a point O, its center, and by a radius r, a line segment of 
length r one of whose endpoints is the center. If we can construct the points labeled 
X and Y in Fig. 14.1a we can claim to have successfully constructed the points of 
intersection of a given circle with a given line. Similarly, the construction of X, Y in 
Fig. 14.1b is the construction of the points of intersection of two given circles. The 
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Fig. 14.la X,Y are the points of intersection Fig. 14.1b X,Y are the points of intersec- 
of a line and a circle tions of two circles 


circles drawn with dashed lines in a diagram do not actually appear in a construction; 
they are just used to help understand the construction. 

The single given circle used in the constructions, called the fixed circle, can appear 
anywhere in the plane and can have an arbitrary radius. 


14.2 Construction of a Line Parallel to a Given Line 


Theorem 14.1 Given a line | defined by two points A, B and a point P not on the 
line, it is possible to construct a line through P that is parallel to AB. 


Proof There are two cases to the proof. 
Case 1: AB is a directed line segment if the midpoint M of AB is given. Construct 
a ray that extends AP and choose any point S on the ray beyond P. Construct the 
lines BP, SM, SB. The intersection of BP and SM is denoted O. Construct a ray that 
extends AO and denote by Q the intersection of the ray AO with SB (Fig. 14.2). 
We claim that PQ || AB. 


A M B 


Fig. 14.2 Construction of a parallel line in the case of a directed line 
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The proof uses Ceva’s theorem. 

Ceva’s theorem (Thm. A.19): If the line segments from the vertices of a triangle to 
the opposite edges intersect in a point O (as in Fig. 14.2), the lengths of the segments 
satisfy: 


— AM 
In Fig. 14.2 M is the midpoint of AB so — = 1 and the equation becomes: 
MB 


BQ PA AP 
Se, (14.1) 
QS SP PS 
since the order of the endpoints of a line segment is not important. 
We claim that AABS ~ APQS: 


BS _BO OS _ BQ, 
OS OS QS OS 
AS _AP PS _ AP 


= = +1. 
PS PS PS PS 
Using Eq. 14.1: 
BS BQ AP AP PS AS 
— = SS tS St ls FS Se 
QS QS PS PS PS PS 


and it follows that AABS ~ APQS and therefore PQ || AB. 

Case 2: AB is not necessarily a directed line segment. The fixed circle c has 
center O and radius r. P is the point not on the line through which it is required to 
construct a line parallel to / (Fig. 14.3a). 

Choose M, any point on /, and construct a ray extending MO that intersects 
the circle at U, V. UV is a directed line segment because O, the center of the circle, 
bisects the diameter UV. Choose a point A on / and use the construction for a directed 
line segment (Case 1) to construct a line through A parallel to UV which intersects 
the circle at X,Y (Fig. 14.3b). 

Construct a diameter from X through O that intersects the other side of the circle 
at X’, and similarly construct the diameter YY’. Construct the ray from X’ through 
Y’ and denote by B its intersection with J. We claim that M is the bisector of AB so 
that AB is a directed line segment and therefore a line can be constructed through P 
parallel to / (Fig. 14.4). 
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M A M 


Fig. 14.3a Construction of a directed line Fig. 14.3b Construction of a line parallel to 
the directed line 


OX,OX’', OY, OY’ are all radii of the circle and ZXOY = ZX’OY’ since they 
are vertical angles, so AXOY = AX’OY’ by side-angle-side. Define! l’ to be a line 
through O parallel to / that intersects XY at Z and X’Y’ at Z’. XOZ = /X'OZ' are 
vertical angles, /ZXO = £Z’X’O are alternate interior angles and XO = XO’ are 
radii, so AXOZ = AX’OZ’ by angle-side-angle and ZO = OZ’. Therefore, AMOZ 
and BMOZ’ are parallelograms and AM = ZO = OZ’ = MB. oO 


Theorem 14.2 Given a line segment AB and a point P not on the line, it is possible 
to construct a line segment PQ that is parallel to AB and whose length is equal to 


the length of AB, that is, it is possible to copy AB parallel to itself with P as one of 
its endpoints. 


Fig. 14.4 Proof that l’ is parallel to / 


1 Define, not construct, because we are in the middle of the proof that such a line can be constructed. 
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A B 


P Q 


Fig. 14.5 Construction of a copy of a line parallel to an existing line 


Proof We have proved that it is possible to construct a line m through P parallel 
to AB and a line n through B to parallel to AP. The quadrilateral ABOP is a 
parallelogram so opposite sides are equal AB = PQ (Fig. 14.5). oO 


14.3 Construction of a Perpendicular to a Given Line 


Theorem 14.3 Given a line segment l and a point P not on l, it is possible to 
construct a perpendicular to | through P. 


Proof By Thm. 14.1 construct a line /’ parallel to / that intersects the fixed circle at 
U, V. Construct the diameter VOU’ and the chord VU’ (Fig. 14.6). ZUVU’ is a right 
angle because it is subtended by a diameter. Therefore VU’ is perpendicular to UV 


and /. Again by Thm. 14.1 construct the parallel to VU’ through P. oO 
U’ 
$P 
E 
U V 
l 


Fig. 14.6 Construction of a perpendicular line 
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Fig. 14.7 Copying a line segment in a given direction 


14.4 Copying a Line Segment in a Given Direction 


Theorem 14.4 It is possible to construct a copy of a given line segment in the 
direction of another line. 


The meaning of “direction” is that the line defined by two points A’, H’ is at an 
angle 0 relative to some axis and the goal is to construct AS = PQ such that AS will 
have the same angle @ relative to that axis (Fig. 14.7). 


Proof By Thm. 14.1 it is possible to construct a line segment AH such that AH || 
A'R’, and to construct a line segment AK such that AK || PQ. ZHAK = @ so it 
remains to find a point S on AH so that AS = PQ. 

Construct two radii OU, OV of the fixed circle which are parallel to AH, AK, 
respectively, and construct a ray through K parallel to UV. Denote its intersection 
with AH by S (Fig. 14.8). By construction, AH || OU and AK || OV, so ZSAK = 
ZHAK = ¿UOV = 9. SK || UV and ASAK ~ AUOV by angle-angle-angle, 
AUOV is isosceles because OU, OV are radii of the same circle. Therefore, ASAK 
is isosceles and AS = AK = PQ. oO 


Ly CL 


Fig. 14.8 Using the fixed circle to copy the line segment 
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Fig. 14.9 Similar triangles to construct the ratio of lengths 


14.5 Construction of a Line Segment as a Ratio of Segments 


Theorem 14.5 Given line segments of lengths n,m, s, it is possible to construct a 
line segment of length: : 
x= —S: 

m 
Proof Choose points A, B,C not on the same line and construct rays AB, AC. By 
Thm. 14.4 it is possible to construct points M, N, S such that AM = m, AN = n, 
AS = s. By Thm. 14.1 construct a line through N parallel to MS which intersects 
AC aa and lapel AX by x (Fig. 14.9). AMAS ~ ANAX by angle-angle-angle so 


m 
—=-—andx=~—s. o 
n x m 


14.6 Construction of a Square Root 


Theorem 14.6 Given line segments of lengths a, b, it is possible to construct a line 
segment of length Vab. 


n , 
Proof We want to express x = Vab as x = —s in order to use Thm. 14.5. 
m 


e Forn we use d, the diameter of the fixed circle. 
e Form we use t = a+b which can be constructed from a, b by Thm. 14.4. 
e We define s = Vhk where h, k are defined as expressions on the lengths a, b, t, d. 


d d 
Define h = ae and k = 7? and then compute: 


th tk t \2 t t 
xa vab= 2% = (4) a ae 
nik- ne pot GtD 4 Wy, 

t t t t 
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Fig. 14.10 Construction of a square root 


By Thm. 14.4 construct HA = h on a diameter HK of the fixed circle. From 
h + k = d we have AK = k (Fig. 14.10). By Thm. 14.3 construct a perpendicular to 
HK at A and denote the intersection of this line with the circle by S.OS = OK = d/2 
and OA = (d/2) — k. 

By Pythagoras’s Theorem: 


2 2 
d d 
2 
=|=] -(~-k 
*=(3) -= 
d\" ? dk 
= aA eee 
2 2 2 
=k(d-k)=kh 
s= Vhk. 
t 
Now x = 7° can be constructed by Thm. 14.5. o 


14.7 Construction of the Intersection of a Line and a Circle 


Theorem 14.7 Given a line l and a circle c(O,r), it is possible to construct their 
points of intersection (Fig. 14.11). 


Proof By Thm. 14.3 it is possible to construct a perpendicular from the center of the 
circle O to the line /. The intersection of / with the perpendicular is denoted by M. 
OM bisects the chord XY, where X,Y are the intersections of the line with the circle 
(Fig. 14.12). Define XY = 2s and OM = t. Note that s, X,Y are just definitions not 
entities have been constructed. 
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Fig. 14.11 Construction of the points of intersection of a line and a circle (1) 


By Pythagoras’s Theorem s? = r? — 1? = (r +t)(r —t). By Thm. 14.4 it is possible 
to construct line segments of length t from O in the two directions OR and RO. The 
result is two line segments of length r + t,r — t. 

By Thm. 14.6 a line segment of length s = y(r + t)(r — t) can be constructed, 
and by Thm. 14.4 line segments of length s from M along / in both directions can 
be constructed. Their other endpoints are the points of intersection of / and c. oO 


Fig. 14.12 Construction of the points of intersection of a line and a circle (2) 
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Fig. 14.13 Construction of the intersection of two circles (1) 


14.8 Construction of the Intersection of Two Circles 


Theorem 14.8 Given two circles c(01, r1), c(O2, r2), it is possible to construct their 
points of intersection. 


Proof Construct O103 and label its length t (Fig. 14.13). Label by A be the point 
of intersection of O;O and XY, and label q = O,A, x = XA (Fig. 14.14). A has 
not yet been constructed, but if g,x are constructed then by Thm. 14.4 the point A 
at length g from O in the direction 0102 can be constructed. 

Once A has been constructed, by Thm. 14.3 a perpendicular to 0,02 at A can 
be constructed, and by Thm. 14.4 it is possible to construct line segments of length 
x from A in both directions along the perpendicular. Their other endpoints are the 


points of intersection of the circles. 


Construction of the length q: Define d = 4 Jr? + ¢?, the hypotenuse of a right 
triangle, which can be constructed from the known lengths r1, t. Note that AO; XO2 
is not necessarily a right triangle; the right triangle can be constructed anywhere in 
the plane. In the right triangle AX AO,, cos 4XO,A = q/r\. By the Law of Cosines 
for AXO,O2: 


Fig. 14.14 Construction of the intersection of two circles (2) 
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r2 = Ptr? — 2rıt cos LXO102 
= Pert — 2tq 
2tq = (Ptr) -r=d-r 
_ (d+12)(d~12) 
E 2t ` 


By Thm. 14.4 these lengths can be constructed and by Thm. 14.5 q can be constructed 
from d + r2, d — r2, 2t. 


Construction of the length x: By Pythagoras’s Theorem: 


x=qri-@P=v(nita(ri-4@)- 


By Thm. 14.4, h = ri +q,k = rı — q can be constructed, as can x = Vhk by 
Thm. 14.6. o 


What Is the Surprise? 


A compass is necessary because a straightedge can only compute the roots of linear 
equations and not values such as V2, the hypotenuse of an isoceles right-triangle 
with sides of length 1. However, it is surprising that the existence of only one circle, 
regardless of the position of its center and the length of its radius, is sufficient to 
perform any construction that is possible with a straightedge and compass. 


Sources 


This chapter is based on problem 34 of [13] reworked by Michael Woltermann [14]. 
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Check for 


Chapter 15 updates.) 


Are Triangles with Equal Areas and Perimeters 
Congruent? 


Are two triangles with the same area and the same perimeter congruent? Not neces- 
sarily: the triangles with sides (17,25,28) and (20,21,29) both have perimeter 70 
and area 210 but they are not congruent (Fig. 15.1).! This chapter shows that given 
a triangle with rational sides it is possible to construct a non-congruent triangle, 
also with rational sides, that has the same area and the perimeter. We carry out the 


derivation using an example, showing that the triangle with sides (3,4,5) and the 


156 101 41 


triangle with sides (5, ors u) both have perimeter 12 and area 6. 


15.1 From a Triangle to an Elliptic Curve 


The three angle bisectors in a triangle intersect in a point called the incenter of the 
triangle. The incenter is the center of a circle inscribed within the triangle (Fig. 15.2). 


20 2 
17 28 : 


81.20° 90° 
25 21 


Fig. 15.1 Non-congruent triangles with the same area and the same perimeter 


1 The areas were computed using Heron’s formula (Thm. A.3) and the angles using the Law of 
Cosines (Thm. A.8). 
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Fig. 15.2 A circle inscribed within a triangle 


Drop altitudes from the center O to the sides. The altitudes have length r, the 
radius of the inscribed circle. The altitudes and angle bisectors create three pairs of 
congruent right triangles: 


AAOB’ = AAOC’, ABOA’ = ABOC’, ACOA’ = ACOB’. 


The altitudes divide the sides a, b, c into segments u, v, w. The area of AABC is 
the sum of the areas of ABOC, AAOB, AAOC: 


1 1 1 
A= z ty)r+ z0 +tu)r+ z +w) (15.1a) 
1 
a5 -2(u+v+w)r (15.1b) 
1 
= ger beer (15.1c) 
= sr, (15.1d) 


where s is the semi-perimeter, one-half the perimeter of the triangle AABC. The 
lengths of u,v,w can be expressed using the radius of the circle and the central 
angles a/2, 8/2, y/2: 


a u v w 
tan= =—, tan-=-, tan5=—. (15.2) 
r r r 
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The semi-perimeter can now be expressed in terms of the tangents: 


ssutvtwe=rtanS +rtanÉ +rtan? =r (tan S + tan +tan 2) z 


and by Eq. 15.1d the area is: 


Az=sr=r’ [ian S tan +n 4 : (15.3) 
From r = A/s, Eq. 15.3 can be written as: 
A A 2 
tan Ž pe ear EA = = a ; (15.4) 
2 2 2 r? (A/s)? A 
Since the sum of the angles a, 8, y is 360°: 
y/2 = 360°/2 — (a/2 + B/2) (15.5a) 
tan y/2 = tan(180° — (@/2 + B/2)) (15.5b) 
= —tan(a/2 + 8/2) (15.5c) 
t 2+t 2 
X ana/2 + tan B/ l (15.50) 
tana/2 tan B/2 - 1 
using the formula for the tangent of the sum of two angles (Thm. A.9). 
Let us simplify the notation by defining variables for the tangents: 
x = tan yeunk, z= tan >. (15.6) 
By Eq. 15.5d we can express z = tan y/2 in terms of x, y: 
+ 
pn. (15.7) 
xy-1 
With this notation, Eq. 15.4 becomes: 
pe FED s? (15.8) 
x = —., y 
4 xy-1l A 
Given fixed values of A and s are there multiple solutions of Eq. 15.8? 
For the right triangle (3, 4, 5): 
; 2 
s2 (18 +4+5)) 62 
Hend O G (15.9) 


1 
A 5°3-4 6 
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If there is another solution Eq. 15.8 with s? /A = 6, it can be written as: 


sty se (15.10a) 
xy-1 
xy +xy? —6xy+6=0. (15.10b) 


This is an equation for an elliptic curve. 


15.2 Solving the Equation for the Elliptic Curve 


A portion of the graph of Eq. 15.10b is shown Fig. 15.3. Any point on the closed 
curve in the first quadrant is a solution to the equation because the lengths of the 
sides of the triangle must be positive. A, B, D correspond to the triangle (3,4,5) as 
shown below. To find additional rational solutions the method of two secants is used. 

Construct a secant through the points A = (2,3), B = (1, 2). It intersects the curve 
at C = (—1.5, —0.5), but this does not give a solution because the values are negative. 
Construct a second secant from C to D = (3,2). The intersection with the curve at 
E x (1.5, 1.2) does give a new solution whose coordinates will be computed below. 


(3,2) 


Fig. 15.3 The method of two secants 
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The equation of the (red) line through A, B is y = x + 1. From Eq. 15.10b: 


x(x +1) +x(x +1)? -— 6x(x +1) +6=0 


2x? — 3x° -5x+6=0. 
From A, B we know two roots x = 2,x = 1 so we can factor the cubic polynomial: 
(x — 2)(x - 1)(ax+ b) =0, 


where the third root is unknown. Multiply the factors and conclude that a = 2, b = 3 
since 2x3 — 3x? — 5x +6 = ax>+---+2b. The third factor is 2x + 3 which gives the 


third root x = -3 and y=x+1= -4. This is the point C = (-3, -4) in the graph. 
The equation of the (blue) line through C, D is: 
5 1 
e 0b = 15.11 
a ae (15.11) 


Substitute for y in Eq. 15.10b: 


From C, D we know two roots x = 3,x = -3 so we can factor the cubic polynomial: 


(x-3) (5) (ax+b)=0. 


Equating the coefficients of the cubic term and the constant terms gives: 


70 4 
m3 
54 
= — ~ 1.543, 
a 35 


and y can be computed from Eq. 15.11: 


25 
= x 1.190. 
%57 


The coordinates of E are: 


54 25 
35°21 


) = (1.543, 1.190) , 


which are close to the approximations (1.5, 1.2) obtained from the graph. 
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Finally, compute z from Eq. 15.7: 


x+y E 5) | (32 )-2e 8 


z= ay N85 21) N50 fp 65 IS 


15.3 Derivation of a Triangle From the Elliptic Curve 


Using Eqs. 15.2, 15.6, a,b,c, the sides of the triangle AABC, can be computed 
from x, y,z andr = A/s = 6/6=1: 


a=wtve=r(zt+y)=(z+y) 
b=ut+wer(xt+z) =(x+2Z) 


c=u+v=r(x+y)=(x+y). 
For solution A of the elliptic curve the sides of the triangle are: 


a=z+y=1+3=4 
b=x+z72=24+1=3 
c=x+y=24+3=5. 


For solution E of the elliptic curve the sides of the triangle are: 


deba a Le 
Sr he. og 35 

54 49 101 
b=x+z= — + — = 

35 15 21 
E ee 
Be a T5 


Let us check this result. The semi-perimeter is: 


_1(156 101 41) 1 (468 +505+287) 1/1260 
s=7 105 105 


2 


35 2 I5 


> 


~ 2 


and the area can be computed using Heron’s formula (Thm. A.3): 


a= e[s- he- a} (6-5) = 8-6. 


Is (5, HL, 4 = (3,4,5)? To simplify the computation let us use the decimal 


approximations (4.48, 4.81, 2.73). Then: 
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4.48 


Fig. 15.4 The triangle with the same perimeter and area as (3, 4, 5) 


V4.48? + 2.73? = 5.25 + 4.81, 


so this is not a right triangle and not congruent to (3, 4,5). 
The Law of Cosines can be used to compute the angles of the triangle as shown 
in Fig. 15.4. 


What Is the Surprise? 


Are triangles with the same area and perimeter congruent? My first impression 
was to say “yes” because it is not easy to find counterexamples. What is surprising 
is that given an arbitrary triangle with rational sides, it is possible to construct a 


non-congruent triangle with rational sides which has the same area and perimeter, 


156 101 u) 


although the result can be strange as with the triangles (3,4, 5) and (5, 31 15 


Sources 


This chapter is based on [33]. In [3] it is shown that given an isoceles triangle there 
are non-congruent triangles with the same area and perimeter, but the proof does not 
include an explicit construction. 
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Chapter 16 yates | 
Construction of a Regular Heptadecagon 


The only regular polygons that the Greeks knew how to construct with a straightedge 
and compass were the triangle, the square, the pentagon and the regular polygon 
with 15 sides. Given a regular polygon with n sides, a polygon with 2n sides can 
be constructed by circumscribing the polygon with a circle and bisecting the central 
angle (Fig. 16.1). No further progress was made until 1796 when Carl Friedrich Gauss 
awoke one morning, just before his 19th birthday, and by “concentrated thought” 
figured out how to construct a regular heptadecagon, a regular polygon with 17 sides. 
This achievement inspired him to become a mathematician. 

Section 16.1 discusses the relation between the side of a polygon inscribed in a 
circle and the central angle that it subtends. Section 16.2 states without proof the 
Fundamental Theorem of Algebra. Section 16.3 presents the roots of unity, the roots 
of the polynomial x” — 1, which are central to Gauss’s proof. Sections 16.4 and 16.5 
present Gauss’s proof which is based on symmetries of roots of polynomials. Gauss 
derived a formula proving that the heptadecagon is constructible, but a geometric 
construction was not given for almost a century. Section 16.6 gives an elegant 
construction by James J. Callagy. Section 16.7 shows how constructions of a regular 
pentagon can be derived using both geometry and trigonometry. 

Some of the material is more straightforward if presented using complex numbers. 
This material is set off in boxes that can be skipped. 


dips 
As 


Fig. 16.1 Constructing a regular polynomial with 10 sides from a regular pentagon 
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16.1 Construction of Regular Polygons 


The construction of the regular heptadecagon led to the Gauss-Wantzel theorem, 
which states that a regular polygon with n sides can be constructed with a straightedge 
and compass if and only if n is the product of a power of 2 and zero or more distinct 
Fermat numbers 22" + 1 which are prime. The known Fermat primes are: 


Fo=3, Fp =5, Fo=17, F3=257, Fy = 65537. 


A regular polygon with 257 sides was constructed by Magnus Georg Paucker in 1822 
and by Friedrich Julius Richelot 1832. In 1894 Johann Gustav Hermes claimed to 
have constructed a regular polygon with 65537 sides. 

To construct a regular polygon it is sufficient to construct a line segment of 
length cos 9, where 6 is the central angle subtended by a chord that is a side of the 
polygon inscribed in a unit circle. Given the line segment OB = cos 0, construct a 
perpendicular at B and label its intersection with the unit circle by C. Then: 


ae 
cos @ = OF =OB 
OC 
6 = cos™! (OB) . 


The chord AC is a side of the regular polygon (Fig. 16.2). 

Given a line segment defined to have length 1, the lengths that are constructible 
are those which can be obtained from line segments of known length using the 
operations {+, —, x, /, y} (Sect. 2.5). Gauss showed that cos(360°/17), the cosine of 
the central angle of a heptadecagon, is constructible since it can be expressed using 
only these operations: 


360° 1 1 1 | 
cos ate 34-2V17+ 
1 | | 
Liai — V¥34 — 2V 17 — 2434 + 2V17. 


cos 0 B 


Fig. 16.2 The cosine of the central angle of a regular polygon 
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16.2 The Fundamental Theorem of Algebra 


The following theorem will be used without proof. 
Theorem 16.1 Every polynomial of degree n has exactly n roots. 


The statement of the theorem has been simplified because all we will need to 
know is that n roots exist. 


The Fundamental Theorem of Algebra states that every non-constant 
polynomial of degree n in a single variable with complex coefficients has 
exactly n complex roots. If there are multiple roots with the same value, they 
are all counted: x? — 4x + 4 = (x — 2)(x — 2) has two roots both equal to 
2. The polynomial x? + 1 with integer coefficients has two complex roots 
+V-1. Strangely, even though the theorem is about finite algebraic entities— 
polynomials of degree n with n roots—methods of analysis, usually complex 
analysis, are needed to prove the theorem. 


16.3 Roots of Unity 


By the Fundamental Theorem of Algebra (Thm. 16.1) the polynomial x” — 1 has n 
roots for any integer n > 1. One root is x = 1 so there are n — 1 other roots. Denote 
one of these roots by r. Since r” = 1 it is called an n-th root of unity. What about r?°? 


(r?)” = (r)? = 12 =i 


It follows that the n numbers: 


are n-th roots of unity. 


Let r = cos (2) +isin (2). By de Moivre’s formula: 


n n 


(=) _ 6i (=) ous (=| 
cos | — | +i sin | — = cos | — | +isin | — | = 1. 
n n n n 
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Theorem 16.2 Let n be a prime number and let r an n-th root of unity. Then: 


{1, r,r, . Oe cata 


are distinct so they are all the n-th roots of unity. 


Proof Suppose that the powers are not distinct so that rê = r/ for some 0 < i < 
j < n— 1. Then r//r' = r/~' = 1 so there exists at least one positive integer i’ less 
than n such that r” = 1. Let m be the smallest such positive integer. By the division 
algorithm for integers n = ml + k for some 0 < I < nand O < k < m. From: 


1 =r" =r"Hk = (r). rk =1!. r5 =r" 


>’ 


we have 0 < k < m and r* = 1. Since m was defined to be the smallest such positive 


integer k = 0 and n = ml is not prime. o 
Theorem 16.3 Let {a1,a2,..., An-1, an} be the roots of an n-th degree polynomial 
f(x). Then: 

f(x) = (x - ai) (x — a2) -++ (x — an-1)(x — an). (16.1) 


Proof If a; is a root of f(x) by definition f(a;) = 0 but: 


f(ai) = (ai — a1) (ai — a2) +++ (ai — an-1) (ai — an) 
taper =0. 


Therefore, f(x) = (x — a;)g;(x) for some g;(x) and by induction this holds for all 
the roots. o 


From Eq. 16.1 it is easy to see that the coefficient of x”~! is: 
—(a, +a2+:::+4an-1+4án). 
Since the coefficient of x”~! in x” — 1 for n > 2 is zero, we have: 


=(1+r +r? + +r"? 47} =0 


rete tr +r =-1, 
For the heptadecagon this is: 


3 6 


+r) 4784 


ae cage r!l +r? rbr! 


rtererPtrerstr 


447154716 - 1. (16.2) 
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16.4 Gauss’s Proof That a Heptadecagon Is Constructible 


What Gauss understood is that one need not work with the roots in their natural order 


r,r2,...,r'®, The powers of r? give all the roots but in a different order: 


rl, r=, p39. r? 3=27=10 | r 10-3=30=13 | r 13:353955 | r53=15 p15:3=45=11 


y113=33=16 | p16-3=48=14 | p143=42=8 | BSH 2AS 1. p = rre, y12:3=36=2 2:3=6 


r r , 


where the roots have been reduced modulo 17: 


plimtk = (eye i rk =1”. rk = rk f 


Check that the list contains all the roots (except 1) exactly once: 


ri, r’, r’, yf. r, rr, r!l, poe: r4, rê, r', rÍ, r12, r’,r® : (16.3) 


Given a monic quadratic polynomial whose roots are a, b: 
y+ py+q=(y—a)(y—b) =0, 
we can compute the coefficients p, q from the roots (Chap. 7): 
p=-(at+b), q=ab. 


Therefore, given a + b and ab we can write down the quadratic equation of which 
a, b are the roots. 
Let do be the sum of the roots in the odd positions in Eq. 16.3: 


ag=rer er? er? er +P ar er’, 


and let a; be the sum of the roots in the even positions in Eq. 16.3: 


aisr trL +r’ r! +r +r erie, 
To obtain ao, a; as roots of a quadratic equation first compute their sum and use 
Eq. 16.2: 
aota =r+r + +r =-1, 


Now we have to work very hard to compute their product. Figure 16.3 shows the 
computation where the values of r'r/ = r'*/ are written after reducing the exponents 
modulo 17. Check that each root occurs exactly four times so that—again using 
Eq. 16.2—the value of the product is —4. 
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1 1 1 1 1 1 1 1 
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4 4 4 4 4 4 4 4 
=-4. 


Fig. 16.3 Computation of aja); below each root is the number of occurrences of the root so far 


Since ag + a; = —1 and doa; = —4, a1, a2 are the roots of the quadratic equation 
y? +y —4 = 0 and they can be computed using the simple formula for the roots of a 


-1 + V17 


2 
Now, let bo, b1, b2, b3 be the sums of every fourth root starting from r! r? r? r0, 


quadratic equation: 


40,1 = 


respectively: 
bo =r! +r +r!6 +r’ 


3 4 


bi=r +r +r’ +r” 


bo rer aero tr? 


bz =r} 4rar sr. 


Check that bo + b2 = ao, bı + b3 = a, and compute the corresponding products: 


bob2 = (r+ 734 lO 4 r^) x (r? trP4r8 4 r°) 


7 (7104716 479 +r?) + (rô +r!!! +rf+r")+ 
(rè +r! 47747!) + (rB +r? +r? +r) 


=al. 
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tg r1?) x (r!? +r! +r + rô) 


6 


bibs = (+P +r! 
= (rB +r! +r! +r’) + (r5 +r! +r! +r!) + 


(r! +r? +r’ +r’) + (r° +r +r? +r!) 


=-l. 


To summarize these computations: 


bo + b2 = ag 
a = -1 

bi +b3 =a] 
bib, =-1, 


so bo, b2 are the solutions of y? —aoy — 1 = 0, and bj, b3 are the solutions of 
y? — ayy — 1 = 0. Using the values previously computed for ay, a; we can compute 
the roots bo, bı (Fig. 16.4). 


Finally, let co, c4 be the sums of every eighth root starting with r!, r!?: 


co=r!+r!6 
ca=rP +r 
16 


co+c4 =r! +r +r} +r = bo 
coca = (r! +r!6) - (rB +r’) 


=r4e Par? =b, 
so Co, c4 are the roots of y? — boy + bı = 0. Since cos(360°/17) = co/2 (Fig. 16.5) 
it suffices to compute the root co = r! + r!6 (Fig. 16.6). 


The cosine of the central angle of a heptadecagon is constructible with a straight- 
edge and compass since it is composed only of rational numbers and the operations 


{+,-,x,/, V}: 


360°\ _ co 
cos | 7 = (16.4) 


i. 4 1 
Te eT ese (16.5) 


a 68 + 12V17+2(-1+ Vit) V34 - 2V17 - 16.344 2VI7. 
(16.6) 
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bo = 7 
Civ [cie vD\ 
2 2 i 
= 2 
2 
1+ VI7) + /( 1+VI7) 16 
= 4 
_ (14 V17) + V34 = 2V17 
m 4 
7 Aas 
2 
(-1-vi) , |(Gi-vm)", 
2 2 i 


(-1-Vi7) +y (-1 Vi) +16 
4 


_ (=1-— VIT) + V34 +2V17 
= 7 . 


Fig. 16.4 Computation of bo and b; 


16x ieee 2-16 
j 17 
—2r 


16 


Fig. 16.5 The cosine of the central angle computed from r1, r16 
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bo + qb? n3 4b, 


co = 


2 
_ 1 (=1+ V17) + ¥34=2VI7 | 
=5 7 H 


1 A [= r] 
2\ 4 4 


= - 5+ VI+ 5 V34-2VI7 


8 
a sp, rei 1 VID + 34+ 2V7) 


IVT- 34 — 2V17 + 
1+ V17)? +2(-14 V17)-V34 — 2V17 + (34 - 2V17) 
6 16 — 16V17) a 


= IVT- 34 — 2 V17 4 


+ 12V17+2(-14 Vi) ¥34 — NT - 1634 + 2VT7 


Fig. 16.6 Computation of co 


16.5 Derivation of Gauss’s Formula 


The above formula for cos(360°/17) is not the one given by Gauss. Here is a 
derivation of Gauss’s formula: 


Let us simplify 2(-1 + V17) V34 — 2V17: 


2(—1 + VIT) 34 — 2V17 = -2434 — 2V17 + 2VT7 34 - 2VI7 
+4434- VTF - 44/34 - 2VT7 
= 2434 — 2 V17 + 2V17434 - 2V17 


=2(1+ Vit) 34 - VTF -4434 -2V17. 
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We will remember the term —4-¥34 — 2V17 for now and simplify the first term by 
squaring it and then taking the square root: 


2 
2(1 + V17)¥34 - 2V17 = z (1+ VIT)y34-— 2vT7]| 


= ARE +2V17)(34 - 2V17) 


=24/(18-34-4- 17) + VIT(2 -34-2 - 18) 


=2. 4434+ 2V17. 


Substituting terms results in Gauss’s formula: 


cos (E) =- + VT+ E434- 2V17 
++ 68 + 12 V17 + 8434 + 2 V17 — 434 — 2V 17 — 164/34 + 2 V17 
=a +e VI+ eV -2Vi7 
p L174 3V7 - 4207 - 294s av. 


16.6 Construction of a Heptadecagon 


Construct a unit circle centered at O with perpendicular diameters QP and SR 
(Fig. 16.7). Construct A so that OA = (1/4)OR. 


Fig. 16.7 Construction of a heptadecagon (1) 
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By Pythagoras’s Theorem: 


AP = VOA +0P = VQ/42+ L =VI7/4. 


Let B be the intersection of the internal bisector of ZOAP and the line segment OP 
and let C be the intersection of the external bisector of ZOAP and the line segment 
QO. By the internal angle bisector theorem (Thm. A.13): 


OB AO 

BP AP 

OB 1/4 

1-08 Vi7/4 

T E. E. i-vi7 
1+ V17 14+V17 1-vV17 
-1 + V17 
16” 


and by the external angle bisector theorem (Thm. A. 14): 


OC AO 
CP AP 
oc _ 1/4 
1+0C Vi7/4 
aT: O. POE. 1+vVI7 
-1+ V17 -1+vV17 1+vV17 
_1+vV17 
16 ` 


Construct D on OP such that CD = CA = a (Fig. 16.8). By Pythagoras’s Theorem: 


CD = CA = VOA +00 
AG 1+ VI7 
4 


2 
7 1 
16 | = ives irsavn 
- 34 +2V17. 


16 
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A 
a 
b 
Q ° e P 
f MXC O| DB b E 
mM a — > 
f 
F 


Fig. 16.8 Construction of a heptadecagon (2) 


Construct E on OP such that BE = BA = b; again by Pythagoras’s Theorem: 


BE = BA = VOA +OB 


2, 
1\?7 [-14+ 17 1 


ll 

| 
eS) 
R 
l 
N 
$ 
~ 


Construct M as the midpoint of QD and construct F on OS such that MF = MQ = f: 


MF = MỌ = 50D = 5(OC + CD) = Ka - 0C) +CD) 


me Pe cms A re ec ee 
"2 16 16 


= gy (15 ~ V7 +-¥34+2Vi7] . 


Note that MO = 1- MQ =1- MF. 
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A 
P 
Q f MXC DB E e Ff 
ge 
f 
G 
F 


Fig. 16.9 Construction of a heptadecagon (3) 


Construct a semicircle whose diameter is OE. Construct a chord OG = OF = g 
(Fig. 16.9). By Pythagoras’s Theorem: 


OG =OF =\MF -MO =NŅMF -(1- MF) 


= V2MF -1 


= 5 (15- vIT+ Va442v07] -1 


= L-1- va 342v. 


ZOGE is aright angle since it is subtended by a diameter of the circle. Construct H 
on OP such that EH = EG = e; again by Pythagoras’s Theorem: 


EH = EG = VOE -0G =\(0B+BE)2-0G 


2 
= res z (-1- VT+ 34+ 277] 


16 16 16 


z Lose +2(-1 + VID 34 - 2VI7 + (34- 2V7) 
+ (16+ 16417- 16/34 +217] 
= Z 68 + 12 V17 - 16/34 +2Vi7 -2(1 ~ V7) 34- T7. 
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Compute OE: 


— — — ee 1 
OE =OB+B SENU J faoi 


= (v1 eva yaava). 


16 


Finally, OH = OE + EH which is Gauss’s formula for cos (360° / 17). 


16.7 Construction of a Regular Pentagon 


The complex fifth roots of unity are: 


E v5-1 | ,V10+2V5 V5 -1 | W10-2v5 
i 4 ` a - =? = 4 i 


16.7.1 Trigonometry 


The central angle of a regular pentagon is 360°/5 = 72°. Let us compute cos 36° 
using the trigonometric identities for 20 and 6/2 (Thms. A.2.1, A.7): 


0 = cos 90° = cos(72° + 18°) = cos 2 - 36° cos 36°/2 — sin 2 - 36° sin 36° /2 


1 ° _ o 
= (2cos? 36° — yee ~ 2 sin 36° cos 367) HESE l 


There is now only one angle in the formula; let x = cos 36°. Then: 
1 1- 
Q- W Ni xf 
(2x? - 1)V1 +x =2V1-x-Vl+x-x-V1-x 
2x? — 1 = 2x(1 — x) 
4x? -2x-1=0. 


Solving the quadratic equation gives a constructible value: 


cos 36° = J ae : 
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Fig. 16.10 Construction of a regular pentagon (1) 


16.7.2 Geometry 


Let ABCDE be aregular pentagon (Fig. 16.10). By definition all the sides and all the 
interior angles are equal. It is easy to show by congruent triangles that all diagonals 
are equal. Let the length of the sides be | and the length of the diagonals be x. 

AACE = ACAD by side-side-side so ZACE = ZCAD = 0. AAED = ACDE 
by side-side-side so ZADE = CED = ¢. ZAFC = ZEFD = y are vertical angles. 
In both triangles the sum of the angles is 180° so y + 26 = y + 2¢ and 0 = ¢. By 
alternate interior angles we conclude that AC || DE. 

Construct a line through E parallel to DC and let F be its intersection with AC 
(Fig. 16.11). CDEF is a rhombus so EF = CD = AE = 1. AACE is an isoceles 
triangle with base angles a. AAFF is also isoceles and ZAFE = ZFAE =a so 
AACE ~ AAEF. Taking ratios of the sides gives: 


2 


The result is a quadratic equation x“ — x — 1 = 0 whose positive root is constructible: 


1+ V5 
z ` 
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Fig. 16.11 Construction of a regular pentagon (2) 


What Is the Surprise? 


It is surprising that two millennia passed from the work of the Greeks on construction 
to the discovery by Gauss of the constructibility of the regular heptadecagon. It is 
also surprising that the problem was solved not by using geometry but by inventing 
new algebraic methods that had a far-reaching influence in mathematics. 


Sources 


This chapter is based on [6]. Gauss’s original work is available in an English trans- 
lation [18]. Equation 16.5-16.6 appears in [41]; the author assigns an exercise to 
transform it into Gauss’s formula as it appears in [18, p. 458] and [6, p. 68]. 

The construction of the heptadecagon is taken from [10] while other constructions 
can be found in [55]. The trigonometric construction of the regular pentagon is from 
[59]. The geometric construction of the regular pentagon was obtained by solving 
exercises 2.3.3 and 2.3.4 in [47]. 
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Appendix A 
Theorems From Geometry and Trigonometry 


This appendix presents theorems in geometry and trigonometry that may not be 
familiar to the reader, as well as theorems that may be familiar but whose proofs 
are not. Section A.1 presents three formulas for computing the area of a triangle. 
Section A.2 proves trigonometric identities. Although the formulas and identities 
are mostly familiar, students frequently learn these identities by heart or look them 
up without ever seeing a proof. The following sections contain proofs of advanced 
theorems in geometry: Sect. A.3—the angle bisector theorems, Sect. A.4d—Ptolemy’s 
theorem that relates the sides and diagonals in a quadrilateral circumscribed by a 
circle, Sect. A.5—Ceva’s theorem relating the three line segments of a triangle, and 
Sect. A.6—Menelaus’s theorem on the segments of a transversal in a triangle. 


A.1 Theorems About Triangles 
A.1.1 Computing the Area of a Triangle 


The standard formula for computing the area of a triangle from the base and the 
height is well-known. It can be proved using various geometric methods. 


Theorem A.1 The area of the triangle AABC is given by: 
1 
AABC = bh, (A.1) 
where b, the base, is one of the sides of the triangle, and h, the height, is the length 


of the altitude to b from the opposite vertex (Fig. A.1a). 


Proof Figure A.1b shows that by “cutting” the triangle at half the height, we can 
“move” the shaded triangles to form a rectangle of the same area as the triangle. The 
rectangle’s base is b and its height is h/2. o 
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B 
c a 
h=csin0 
6 
A C A C 
b 
Fig. A.la Computation of the area of a trian- Fig. A.1b Computation of the area of a trian- 
gle from the base and the height gle from the base and the height 


Theorem A.2 The area of the triangle AABC is given by: 
1 f 
AABC = Pe siné. (A.2) 


Proof From Thm. A.1 using h = c sin 0. o 


Theorem A.3 (Heron) The area of the triangle AABC is given by: 
AABC = ŅXs(s-a)(s—-b)(s-c), 
where s, the semi-perimeter of the triangle, is equal to F(a +b+c). 


Proof A radius of a circle and a tangent that intersects the radius are perdendicular. 
Furthermore, the lengths of the line segments of two tangents from the same point 
to the circle are equal. Therefore (Fig. A.2):! 


AAOB’ = AAOC’, ABOA’ = ABOC’, ACOA’ = ACOB’. 


The area AABC is the sum of the six triangles listed above. Since the height of six 
triangles is r, the radius of the inscribed circle, we obtain: 


AABC = AAOB’+AAOC'+ABOA'+ABOC’+ACOA'+ACOB’  (A.3a) 


1 
AABC = sr(ututvevews+w) (A.3b) 
1 
AABC = ara +b+c) (A.3c) 
AABC =rs. (A.3d) 


1 This shows that the incenter, the center of the inscribed circle, is the common intersection of the 
three angle bisectors. 
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Fig. A.2 Triangle with an inscribed circle 


Let us now define the sides in terms of the tangents of the central angles: 


From these definitions and s = 5(2u +2u + 2w) we get: 


ssutvewsr(tan Stan f+ tan 2) 
Since + 2+8 +8 4X4 X = 360° and thus % + Ê + % = 180°, by Thm. A.11: 
s=r a ee a” 
a a: 
l 
=r(*2Ħ) = Sw) 
rrr r 


uvw 


sS 
By Eq. A.3d: 
AABC =rs=s wee =Ysuvw. 


S 


Heron’s formula follows from u = s—a,v=s—b,w=s-c. o 
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A.2 Trigonometric Identities 
A.2.1 The Sine and Cosine of the Sum and Difference of Two Angles 


Theorem A.4 


sin(a@ + £) = sinacos £ + cos asin B 
sin(« — £) = sinacos 8 — cos « sin 8 
cos(a@ + 8) = cosa cos $ — sina sin B 


cos(@ — 8) = cos œ cos $ + sin ae sin f. 


We will prove the first formula; the other formulas can be obtained using the values 
of sine and cosine for —@ and 90° — a. 

Given a right triangle AABC with acute angle «œ and a right triangle AACD with 
acute angle 8, we can join them to obtain geometric figures with an angle a + 8 
(Fig. A.3). The left diagram is the one most often used in proofs of the identities. 
Here we give two proofs based on the center and right diagrams. 


D F a 
c 
g A 
Q Qa 
A B B C D A B 


Fig. A.3 Diagrams for proving the identity for the sine of sums of angles 


Proof (1) Let us compute the area of AABD in two different ways: (1) using Eq. A.2 
on AABD, and (2) using the equation separately on AABC and AADC (Fig. A.4). 
h is also computed twice using the definition of the trigonometric functions: 
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A 
a| B 
c h b 
B- o c m D 


Fig. A.4 Computation of the area of a triangle in two ways 


1 
AABD = zbe sin(a@ + £) 
AABD = AABC + AADC 


1 1 
= x<chsina + -bh sin 
2 2 
1 1 
= AG cos f) sina + z(e cos æ) sing. 
Equating the two formulas for AABD and canceling sbe, we get: 
sin(a + 8) = sin & cos f + cos a sin £. o 


The second proof uses the following theorem: 


Theorem A.5 In a circle of diameter 1 the length of a chord that subtends an 
inscribed angle is equal to the sine of the angle (Fig. A.5). 


Proof Let AB be a diameter and let BAC = a. Let D be any other point on the 
circle one of whose sides is the chord BC. Since equal chords subtend equal inscribed 
angles 4BDC = a. In the right triangle AABC: 


Fig. A.5 All inscribed angles subtended by a chord are equal 
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D 


NX c 


Fig. A.6 A quadrilateral circumscribed by a circle 


Proof (2) This proof is based on the right diagram in Fig. A.3 reproduced in Fig. A.6, 
where the quadrilateral ABCD has been inscribed in a circle. By Thm. A.15 a 
quadrilateral can be circumscribed by a circle if and only if the sum of each pair 
of opposite angles is 180°. ZADC + ZABC = 180° since both angles are right 
angles. From Thm. 5.4 the sum of the interior angles of a quadrilateral is 360°, so 
ZDAB + ZDCB = 180°. 

Let the diameter of the circle be 1 (otherwise, multiply everything by the length 
of the diameter). Then the sides of the quadrilateral are: 


BC =sina, CD =sinB, AB = siny, DA =sino, 
and their diagonals are: 
BD =sin(a+B), CA=sin(a+y). 


By Ptolemy’s Theorem (Thm. A.18) the product of the diagonals of a quadrilateral 
circumscribed by a circle is equal to the sum of the products of opposite sides of the 
quadrilateral. Since ZADC and ZABC are right angles we have: 


sin(a + £) sin(a + y) = sina sin ô + sin £ sin y 
sin(a + 8) sin(90°) = sin æ sin(90° — 8) + sin £ sin (90° — œ) 


sin(a@ + £) = sin æ cos $ + cos æ sin f. 
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A.2.2 The Cosine of a Triple Angle 
Theorem A.6 
cos 3a = 4cos? a — 3 cosa. 
Proof The proof uses the formulas in Thm. A.4 and the formula sin” œ + cos? a = 1: 


cos 3a = cos(2a + q) 
= cos 2a cos œ — sin 2a sin œ 
= (cos? a — sin? a) cos æ — (2 sina cos a) sin œ 


3 2 


= cos? q — cosa sin? a — 2 sin? a cos @) 


= cos? q — cos q + cos? a — 2 cos œ + 2 cos? a 


= 4cos* a -3 cose. o 


A.2.3 The Sine and Cosine of a Half-Angle 


Theorem A.7 If œ is an angle in a triangle then: 


[1+cosa@ 
2: 

À (2) 1-cosa 

sin | — =, ———_. 
2 2 


Q 
© 
a 
— 
NIR 
SS 
ll 


Proof The proof uses the formulas Thm. A.4 and the formula sin? a + cos? a = 1: 
2(5) = e08(5)e0s (5) -sia (5) sin (3) 
cos œ = cos2 | = | = cos | —] cos | = } — sin | —]} sin { = 
s 2 2 2 T ND 
=2cos? (=) -1 
2 
(2) 1+cosa 
cos |=] = 
2 2 
1+ 
sin? (Z) =1 - cos? (Z) =1- — 
i (2) 1 — cos œ = 
sin{|—] = 
2 2 


2 The general formula is more complex because the square roots can be either positive or negative 
depending on the quadrant in which a//2 is located. For a triangle 0 < a < 180°, so 0< a/2<90° 
is in the first quadrant and both the sine and the cosine are positive. 
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A.2.4 The Law of Cosines 


Theorem A.8 (Law of cosines) Jn a triangle AABC with sides a, b, c (Fig. A.7): 


Cc =a’ +b? —2abcos ZACB. 


Proof (1) Drop an an altitude from C to AB and use the definition of cosine and 


Pythagoras’s Theorem: 


c=x+(c-x)=acosB+bcosa 


ce = accos p +bccosa. 


Similarly, drop altitudes from A to BC and from B to AC to obtain: 


a” = cacos B + ba cos y 


b? = chcosa+abcosy. 


Adding Eqs. A.5a and A.5b and subtracting Eq. A.4b gives: 


a’ + b? —c* = cacos B + ba cos y 
+ cb cos «œ + ab cos y 
—accos B — bc cos œ 
= 2ab cos y 


c? =a* +b? —2abcosy. 


Fig. A.7 Proof 1 of the Law of Cosines 


(A.4a) 
(A.4b) 


(A.5a) 
(A.5b) 
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Proof (2) The second proof uses Ptolemy’s theorem (Thm. A.18).3 

The triangle AABC can be circumscribed by a circle. Construct another triangle 
AABC’ congruent with AABC and inscribed within the same circle (Fig. A.8). This 
can be done by constructing an angle from AB equal to ZCAB which intersects 
the circle at C’ and then constructing the line C’A. Since angles that are subtended 
by the same chord are equal ZAC’B = ZBCA, so also ZCBA = ZC’ AB and thus 
AABC’ = ABAC by angle-side-angle with the common side AB. 

Drop perpendiculars from C to D and from C’ to D’ on AB so that x = a cos £. 
By Ptolemy’s theorem for the quadrilateral ABCC’: 


b? = a* +c(c — 2x) 
= a’ +c(c — 2acos p) 


=a’ + —2accosB. oO 


Fig. A.8 Proof 2 of the Law of Cosines 


3 Section A.4 uses the Law of Cosines to prove Ptolemy’s theorem! The first proof of the Law of 
Cosines avoids this circular reasoning. Furthermore, there are proofs of Ptolemy’s theorem that do 
not use the Law of Cosines. 
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A.2.5 The Tangent of the Sum of Two Angles 


Theorem A.9 
_ tana +tanp 
a 1 — tan e tan £ ` 
Proof 
sin(a@ + £) 
t = 
atete cos(a + 8) 


sina cos f + cos a sin 8 


cos « cos 8 — sina sin B 
sina + cosa tan 8 


cos aw — sina tan B 
tana + tan 8 
1 — tan q tan 8 ` 


A.2.6 The Tangent of a Half-Angle 


Theorem A.10 
i (2) —1 + V1 + tan? a 
an { — ——. 
2 


tan œ 


Proof We derive and solve a quadratic equation in tan (5): 


tan œ = 


tan a tan” (5) +2tan (5) -tang =0 
: (2) -1+V1+tan2a@ 
an {| = = —— 


2 tana i o 
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A.2.7 The Product of Three Tangents 


Theorem A.11 /fa + 8 +y = 180° then: 
tan œ + tan £ + tan y =tanatanftany. 
Proof 


tan y = tan(180° — (æ + £)) 
= —tan(a + $) 
_ tang+tang 
— 1l-tanatanß 
tan q tan £ tan y = tan «œ + tan £ + tan y . oO 


A.2.8 The Limit of sin a/a@ 


Theorem A.12 
lim sin œ 


a-0 «Q 


=1. 


Proof By examining regular polygons inscribed within a circle (Fig. A.9), we see 
that the more sides that a polygon has, the closer its perimeter is to the circumference 
of the circle. The circumference of the circle divided by the number of sides is the 
length of an arc with the same endpoints as the corresponding side, since in a regular 
polygon all sides have the same length. Since the ratio of the circumference of the 
circle to the perimeter of an inscribed polygon approaches 1 as the number of sides 
increases, so does the ratio of the length of an arc to the corresponding chord. This 
is demonstrated by the following numerical examples: 


Fig. A.9 Regular polygons with 3, 8 and 16 sides inscribed within a circle 
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O 


a 


Fig. A.10 The length of a chord corresponding to an arc of size œ 


Angle Arc length Chord length Ratio 


80 1.396 1.286 1.090 
60 1.047 1.000 1.047 
40 0.698 0.684 1.006 

5 0.087 0.087 1.000 


Since a = b = 1 the length of the chord c subtending œ can be computed from 
the Law of Cosines (Fig. A.10): 


C =a +b? —2ab cosa 


c= V2-2cosa 
lim c = V2-2.1=0. 


a0 


Referring to Fig. A.11: 


. sing . 2sing 
lim = lim ; 
a-0 Q a>0 2a 


This is the ratio of the length of chord PQ to the length of arc PO. But we have seen 
that this ratio converges to | as the subtended angle 2a tends to 0, so: 
sin œ 


lim £1; 
a-0 «Q Oo 


Fig. A.11 Ratio of sin x to x 
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A.3 The Angle Bisector Theorems 


Theorem A.13 In AABC let the angle bisector of ¿BAC intersect BC at D 
(Fig. A.12). Then: 

BD AB 

CD AC 
Proof We prove the theorem by computing the areas of two triangles using both the 
base and height (Eq. A.1), and the base, angle and side (Eq. A.2): 


AABD = 5BDh Z LAB ADsina 
BD AD sina 
AB h 
— —— 
AACD = zP” = zACAD sin œ 


CD AD sina 

AC h 

BD AB 

= Oo 
CD AC 


There is also an angle bisector theorem for the external bisector: 


Theorem A.14 In AABC let AE be the bisector of the angle supplementary to the 
angle ABAC (Fig. A.13) and let the bisector intersect BC at E (Fig. A.12). Then: 


BE AB 
CE AC 
A 
Qa!\ @ 
h 
B $ c 


Fig. A.12 The internal angle bisector theorem 
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E B 


Fig. A.13 The external angle bisector theorem 


Proof Since AC is a straight line ZEAC = 180° — a. 
1— 1—— . 
AABE = _BEn = ie AB sina 


1 1____ l- 
AACE = zCEr = 74E AC sin(180° — œ) = ae AC sina 


BE AEsina [ CE 


AB h AC 


AB 


A.4 Ptolemy’s Theorem 
A.4.1 A Trapezoid Circumscribed by a Circle 


Before giving the proof of Ptolemy’s theorem we prove theorems on quadrilaterals 
and trapezoids. 


Theorem A.15 A quadrilateral can be circumscribed by a circle if and only if the 
opposite angles are supplementary (sum to 180°). 


Geometry textbooks give the simple proof of the forward direction, but it is hard 
to find a proof of the converse so both proofs are given here. 


Proof (Forward direction) An inscribed angle is equal to half the arc that subtends it 
so ZDAB is half of the arc DCB and ZDCB is half of the arc DAB (Fig. A.14a). The 
two arcs form the entire circumference of the circle so their sum is 360°. Therefore, 
ZDAB + ¿DCB = 5 - 360° = 180°, and similarly ZADC + ZABC = 180°. 
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A 
D A 
D 
B 
B 
C 
C 
Fig. A.14a A quadrilateral circumscribed by Fig. A.14b The fourth vertex must be on the 
a circle circumference 


Proof (Converse direction) Any triangle can be circumscribed by a circle. Circum- 
scribe ADAB by a circle and suppose that C’ is a point such that DAB + ZDC’B = 
180°, but C’ is not on the circumference of the circle. Without loss of generality, let 
C’ be within the circle (Fig. A.14b). 

Construct a ray that extends DC’ and let C be its intersection with the circle. 
ABCD is circumscribed by a circle so: 


¿DAB + ZDCB = 180° = ZDAB+ZDC’B 
¿DCB = ZDC'B, 


which is impossible if C is on the circle and C’ is inside the circle. Oo 


Theorem A.16 The opposite angles of an isosceles trapezoid are supplementary. 


Proof Construct the line AB’ parallel to CD (Fig. A.15). AB’CD is a parallelogram 
and AABB’ is an isosceles triangle, so ZC = ZABB’ = ZAB’B = ZB. Similarly, 
ZA = ZD. Since the sum of the internal angles of any quadrilateral is equal to 360°: 


ZA+ZB+2C+ZD = 360° 
2£A +2ZC = 360° 
ZA+ ZC = 180°, 


and similarly ZB + ZD = 180°. o 
Theorem A.17 An isoceles trapezoid can be be circumscribed by a circle. 


The proof is immediate by Thms. A.15, A.16. 
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v 
< 


A 


Fig. A.15 An isoceles trapezoid 


A.4.2 Proof of Ptolemy’s Theorem 


Theorem A.18 (Ptolemy) Given a quadrilateral circumscribed by a circle, the fol- 
lowing formula relates the lengths of the diagonals and the lengths of the sides 
(Fig. A.16). 

ef =acthbd. 


Proof By the Law of Cosines for the four triangles AABC, AADC, ADAB, ADCB: 


e? = a’ + b? — 2ab cos ZB 
e? = c* +d’ —2cdcos LD 
f? =a° +d? -2adcos ZA 
f? =b +c? —2becos /C. 
ZC = 180° — ZA and ZD = 180° — ZB because they are opposite angles of a quadri- 


lateral circumscribed by a circle, so cos ZD = —cos ZB and cos ZC = -cos ZA. 
Eliminate the cosine term from the above equations to obtain: 


Fig. A.16 Ptolemy’s theorem 
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e?(cd +.ab) = abc? + abd? + a’cd + b’cd 
2 _ (ac +bd)(ad + bc) 

ain (ab + cd) 

_ (ab+cd)(ac + bd) 

~ (ad + bc) 


rP 
Multiply the two equations and simplify to get Ptolemy’s theorem: 


e- f? = (ac + bd? 
ef = (ac+bd). o 


A.5 Ceva’s Theorem 


Theorem A.19 (Ceva) Given line segments from the vertices of a triangle to the 
opposite edges that intersect ina point, the lengths of the segments satisfy (Fig. A.17): 
AM BQ SP 
MB QS PA 
Proof If the altitudes of two triangles are equal, their areas are proportional to the 
bases. In both diagrams in Fig. A.18, the altitudes of the gray triangles are equal, so: 
ABQO BQ ABQA BQ 
ASQO OS” ASQA OS 


By subtracting the areas of the indicated triangles, we get the proportion between 
the gray triangles shown in Fig. A.19: 


Fig. A.17 Ceva’s theorem 
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S S 


Q Q 


& h’ 


A B A B 


Fig. A.18 Triangles in Ceva’s theorem 


ABOA ABQA-ABQO BQ 
ASOA ASQA-ASQO Q5` 


This might look strange at first so we explain it using a simpler notation: 


cC a 
d b 
SEE 
f b 
_ad af a 

e p AN 
CE _@ 
d-f b’ 

S 
Q 
A B 


Fig. A.19 Subtracting areas in Ceva’s theorem 
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Similarly, we can prove: 


AM _ AAOS 
MB ABOS 
SP — ASOB 
PA AAOB’ 
so: ee 
AM BQ SP _ AAOS ABOA ASOB _ 
MB OS PA ABOS SOA AAOB 
since the order of the vertices in a triangle makes no difference. o 


A.6 Menelaus’s Theorem 
Theorem A.20 (Menelaus) 

Let AABC be a triangle and DBQ a transversal line that intersects all three of 
the edges of the triangle or their extensions (Fig. A.20). Then: 4 


e =1. (A.6) 


Proof Draw a line through C parallel to AB and extend DQ until it intersects the 
parallel at K. From AADB ~ ACDK it follows that: 


Fig. A.20 Menelaus’s theorem 


4 Depending on the configuration of the triangle and the transversal line, the result of the multipli- 
cation can be either plus or minus one. 
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CD CK 
AD AB 
From ABQP ~ AKQC it follows that: 


QC CK 

PO BP 
Eliminating CK gives AB - CD - PQ = QC - BP - AD which can be re-arranged to 
obtain Thm. A.6. oO 
Sources 


The appendix is based primarily on [19]. Ceva’s theorem and Menelaus’s theorem 
can be proved from each other [45]. 


References 


Wikipedia references are listed by their title only. 


All links were alive on 7 December 2021. 


Aigner, M., Ziegler, G.M.: Proofs from THE BOOK (Fifth Edition). Springer (2014) 
Alperin, R.C.: A mathematical theory of origami constructions and numbers. New York 
Journal of Mathematics 6, 119—133 (2000) 

Barabash, M.: A non-visual counterexample in elementary geometry. The College Mathematics 
Journal 36(5) (2005) 

Ben-Ari, M.: Mathematical Logic for Computer Science (Third Edition). Springer (2012) 
Ben-Ari, M.: LearnSAT: A SAT solver for education. Journal of Open Source Software 3(24), 
639 (2018). https: //doi.org/10.21105/joss.00639 

Bewersdorff, J.: Galois Theory for Beginners: A Historical Perspective. American Mathemat- 
ical Society (2006) 

Bold, B.: Famous Problems of Mathematics: A History of Constructions with Straight Edge 
and Compass. Van Nostrand (1969) 

Bradford, P.V.: Visualizing solutions to n-th degree algebraic equations using 
right-angle geometric paths. Archived May 2, 2010 at the Wayback Ma- 
chine. https://web.archive.org/web/20100502013959/http: //www.concentric. 
net/~pvb/ALG/rightpaths.html (2010) 

Butler IV, L.: Ramsey theory. https://www.whitman.edu/Documents/Academics/ 
Mathematics/2016/Barton. pdf (2016) 


10. Callagy, J.J.: The central angle of the regular 17-gon. The Mathematical Gazette 67(442), 
290-292 (1983). https://www. jstor.org/stable/3617271 

11. Courant, R., Robbins, H.: What is Mathematics? An Elementary Approach to Ideas and 
Methods (Second Edition). Oxford University Press (1996). Revised by Ian Stewart 

12. Davies, R.: On Langford’s problem (ID). The Mathematical Gazette 43, 253-5 (1959) 

13. Dorrie, H.: 100 Problems of Elementary Mathematics: Their History and Solution. Dover 
(1965) 

14. Dorrie, H.: 100 problems of elementary mathematics: Their history and solu- 
tion. Reworked by Michael Woltermann. Archived 21 February 2020 at the 

© The Author(s) 2022 219 


M. Ben-Ari, Mathematical Surprises, https://doi.org/10.1007/978-3-03 1-13566-8 


220 References 


Wayback Machine. https://web.archive.org/web/20191223032114/http: //www2. 
washjeff.edu/users/mwoltermann/Dorrie/DorrieContents.htm (2010) 

15. Dudley, U.: A Budget of Trisections. Springer (1987) 

16. Eppstein, D.: Twenty proofs of Euler’s formula: V — E + F = 2. https://www.ics.uci. 
edu/~eppstein/junkyard/euler/ (n.d.) 

17. Fraleigh, J.B.: A First Course in Abstract Algebra (Seventh Edition). Addison-Wesley (2003) 

18. Gauss, K.F.: Disquisitiones Arithmeticae. Yale University Press (2006). Editors: Todd W. 
Bressi and Paul Groth 

19. Gelfand, I., Saul, M.: Trigonometry. Springer (2001) 

20. Graham, R., Butler, S.: Rudiments of Ramsey Theory (Second Edition). American Mathemat- 
ical Society (2015) 

21. Gunderson, D.S.: Handbook of Mathematical Induction: Theory and Applications. Mathemat- 
ical Association of America (2010) 

22. Heath, T.L.: The Thirteen Books of Euclid’s Elements. Dover (1956) 

23. Heule, M.J.H., Kullmann, O.: The science of brute force. Communications of the ACM 60(8), 
70-79 (2017) 

24. Hull, T.C.: Solving cubics with creases: The work of Beloch and Lill. American Mathematical 
Monthly 118(4), 307-315 (2011) 

25. Hungerbiihler, N.: A short elementary proof of the Mohr-Mascheroni theorem. American 
Mathematical Monthly 101(8), 784-787 (1994) 

26. Lang, R.J.: Origami and geometric constructions. http: //langorigami.com/ 
wp-content/uploads/2015/09/origami_constructions.pdf (1996-—2015) 

27. Laugwitz, D.: Eine elementare Methode fiir Unmoglichkeitsbeweise bei Konstruktionen mit 
Zirkel und Lineal. Elemente der Mathematik 17, 54-58 (1962) 

28. Loh, P.S.: A different way to solve quadratic equations. https: //www.poshenloh.com/ 
quadratic/ (2019) 

29. Loh, P.S.: A simple proof of the quadratic formula. https: //arxiv.org/abs/1910.06709 
(2019) 

30. Manna, Z.: Mathematical Theory of Computing. McGraw-Hill (1974) 

31. Martin, G.E.: Geometric Constructions. Springer (1998) 

32. Mastin, L.: Muhammad ibn Musa al-Khwarizmi: Muslim Mathematician. https://www. 

storyofmathematics.com/islamic_alkhwarizmi .html (2020) 

33. McCallum, W.: A tale of two triangles: Heron triangles and elliptic curves. http://blog. 

kleinproject.org/?p=4 (2012) 

34. McKay, B.D.: Ramsey theory. http://users.cecs.anu.edu.au/~bdm/data/ramsey. 

html (nd) 

35. Miller, J.: Langford’s problem, remixed. http: //dialectrix.com/langford. html (2014) 

36. Newton, L.: The power of origami. https: //plus.maths.org/content/power- origami 

37. Peil, T.: The rusty compass theorem. Archived 20/07/2020 at the Wayback Ma- 
chine. https: //web. archive. org/web/20200720195718/http://web.mnstate.edu/ 
peil/geometry/C2EuclidNonEuclid/1Compass.htm (2006) 

38. Ramanujan: Squaring the circle. Journal of the Indian Mathematical Society V, 138 (1913). 
http: //ramanujan.sirinudi.org/Volumes/published/ram05.pdf 

39. Ramanujan: Modular equations and approximations to 7. The Quarterly Journal of Pure 
and Applied Mathematics XLV, 350-372 (1914). http://ramanujan.sirinudi.org/ 
Volumes/published/ram06.pdf 

40. Riaz, M.: Geometric solutions of algebraic equations. American Mathematical Monthly 69(7), 
654-658 (1962) 

41. Rike, T.: Fermat numbers and the heptadecagon. https://mathcircle.berkeley.edu/ 
sites/default/files/BMC6/ps0506/Heptadecagon. pdf (2005) 


References 221 


42. 


43. 
44. 


45. 
46. 


47. 
48. 
49. 


50. 


51. 
52. 
53. 
54. 
55. 
56. 
57. 
58. 
39, 
60. 
6l. 
62. 
63. 


Robson, E.: Words and pictures: New light on Plimpton 322. American Mathematical Monthly 
109(2), 105-120 (2002) 

Ross, S.: A First Course in Probability (Tenth Edition). Pearson (2019) 

Schumer, P.: The Josephus problem: Once more around. Mathematics Magazine 75(1), 12-17 
(2002) 

Silvester, J.R.: Ceva = (Menelaus)?. The Mathematical Gazette 84(500), 268-271 (2000) 
Sipka, T.: Alfred Bray Kempe’s “Proof” of the four-color theorem. Math Horizons 10(2), 
21-26 (2002). http://www. jstor.org/stable/25678395 

Stillwell, J.: Mathematics and Its History (Third Edition). Springer (2010) 

Suzuki, J.: A brief history of impossibility. Mathematics Magazine 81(1), 27—38 (2008) 
Thomas, R.: An update on the four-color theorem. Notices of the AMS 45(7), 848-859 (1998). 
http://www.ams.org/notices/199807/thomas.pdf 

Toussaint, G.: A new look at Euclid’s second proposition. The Mathematical Intelligencer 
15(3), 12-23 (1993) 

Wikipedia: Angle trisection 

Wikipedia: Cubic equation 

Wikipedia: Five color theorem 

Wikipedia: Four color theorem 

Wikipedia: Heptadecagon 

Wikipedia: Huzita—Hatori axioms 

Wikipedia: Josephus problem 

Wikipedia: Neusis construction 

Wikipedia: Pentagon 

Wikipedia: Plimpton 322 

Wikipedia: Quadratic equation 

Wikipedia: Quadratrix of Hippias 

Wikipedia: Sexagesimal 


Index 


A 


Abe, Hisashi 141 
Alperin, Roger C. 150 
Appel, Kenneth 41 
Archimedes 11, 16 


B 


Babylonian mathematics 100 
Beloch fold 138 
Beloch, Margherita P. 
Binet’s formula 64 
Burstall, Rod M. 72 


113, 138, 141 


C 


Callagy, James J. 183 

Cardano, Gerolamo 79 

Carlyle circle 84 

Ceva’s theorem 165,215 

Circumscribed circle 204, 212-214 
around a trapezoid 213 

Collapsing compass 1 


construction of a perpendicular bisector 
construction of an equilateral triangle 


Coloring 
of a planar graph 42 
ofaplanarmap 41 
of a polygon 55 
Constructible number 21 
arithmetic operation 21 
depth of square roots 24 
square root 22 


© The Author(s) 2022 


Construction 
with only a compass 151-161 
of aregular heptadecagon 192 
of aregular pentagon 196 
Cubic equation 79 


D 


de Moivre’s formula 185 
Degree of a vertex 46 
Directrix of a parabola 121 
Doubling a cube 

impossibility of 26 

with aneusis 18 

using origami 144, 147 


E 


Elliptic curve 178 
Euclid’s Elements 3 
Euclid’s formula 100 
Euler’s formula 44 
Euler, Leonhard 66 


F 


Fary’s theorem 43 

66, 184 
Fermat, Pierre de 66 
Fibonacci numbers 63 
Five-color theorem 48 
Floating-point underflow 85 
Focus of a parabola 121 
Four-color theorem 50 
Fundamental theorem of algebra 


Fermat numbers 


M. Ben-Ari, Mathematical Surprises, https://doi.org/10.1007/978-3-03 1-13566-8 


25, 185 


223 


224 
G 


Galois, Evariste 2 

Gauss, Carl Friedrich 183 
Gauss-Wantzel theorem 184 
Geometric locus 113 
Guthrie, Francis 41 


H 


Haken, Wolfgang 41 
Hatori, Koshiro 113 
Heath, ThomasL. 9 
Heawood, Percy J. 41 
Heptadecagon 183 

construction 192 

cosine of the central angle 184 
Hermes, Johann Gustav 184 


Heron’s formula see Triangle, Heron’s 


formula 
Heule, Marijn J.H. 91,98, 102 
Hippias 11,19 
Huzita, Humiaki 113 


I 

Inscribed circle 175 
J 

Josephus problem 69 
K 


K3 3 is not planar 45 

Ks is not planar 45 

Kempe chain 48 

Kempe, Alfred B. 41 

Klee, Victor 53 

Kochański, Adam 30 
Kullman, Oliver 91,98, 102 
Kuratowski Kazimierz 45 


L 


Langford’s problem 105 
as a covering problem 106 
solution of L(4) 110 
solvability, conditions for 107 
Langford, C. Dudley 105 


Law of cosines 
214 
Lill’s method 131 
algorithm 133 
cube root of two 137 
multiple roots 132 
negative coefficients 134 
non-integer roots 136 
paths that do not lead to roots 
proof of 137 
quadratic equations 83 
zero coefficients 135 
Lindemann, Carl von 29 
Loh, Poh-Shen 73 


M 


Martin, George E. 141 
Mascheroni, Lorenzo 151 
Mathematical induction 61 
McCarthy’s 91-function 67 
McCarthy, John 67 

Menelaus’s theorem 217 
Messer, Peter 141 

Mohr, Georg 151 
Mohr-Mascheroni theorem 151 


187 
Museum 


guarda 53 
and triangulated polygons 56 


N 


Neusis 
doubling of a cube 18 
trisection of an angle 16 


132 


Index 


18, 36, 172, 175, 206, 210, 


Monic polynomial 24, 25, 27, 74, 78, 137, 


Nonagon see Origami, construction of a 


nonagon 
(0) 


Origami 
angle bisector 117 
axiom 1 114 
axiom2 115 
axiom3 116 
axiom4 118 
axiom 5 119 
axiom 6 121 


Index 


axiom 7 128 

construction of anonagon 148 
doubling acube 144, 147 
geometric constructions 141 
perpendicular bisector 115 
reflection 113 

trisection of anangle 141, 143 


P 


Parabola 
common tangents to two parabolas 121 
cubic equation for the common tangents 
125 
folds of Axiom 6 are tangents 127 
as the locus of a fold 121 
Pascal’s rule 65 
Paucker, Magnus Georg 184 
Paul Erdős 94 
Peano axioms 63 
Planar 
graph 42 
map 41 
Plimpton 322 100 
Polygon 
convex and concave vertices 55 
triangulated 54,57 
Poncelet, Jean-Victor 163 
Probabilistic method 94 
Propositional logic 96 
Ptolemy’s theorem 204, 212 
Pythagorean triples 91, 98, 100 
primitive 100 


Q 


Quadratic equation 73 
completing the square 78 
numerical computation of the roots 85 
roots of 74 
traditional formula 78 

Quadratrix 19 
compass 20 
squaring a circle 38 

Quadrilateral 
inscribed in a circle 204 

Quartic equation 77 


R 


Ramanujan 32,35 


225 


Ramsey’s theorem 93 

lower bound 94 

lower bound for 94 

proof that R(3)=6 93 
Ramsey, Frank P. 93, 102 
Regular polygon 

cosine of the central angle 184 
Richelot, Friedrich Julius 184 
Roots 

of cubic polynomials 24-25 

of unity 185 
Round-off error 85 


S 


SAT solver 96 
DPLL algorithm 99 
unit propagation 99 
Schur triples 89, 97 
Sexagesimal number system 101 
Six-color theorem 48 
Squaring a circle 
approximation 30, 32, 35 
with a quadratrix 38 
Steiner, Jakob 163 


T 


Trapezoid, isoceles 154,213 
Triangle 
angle bisector theorem 211 
computing the area 199-201 
Heron’s formula 175, 200 
incenter 175,200 
inscribed circle 200 
all are isoceles 7 
same area and perimeter 175 
Triangulated graph 42 
Trigonometric identities 202-210 
cosine of a triple angle 205 
limit of sina/a@ 209 
product of three tangents 209 
sine and cosine of a half-angle 205 
sine and cosine of the sum and difference of 
two angles 202 
tangent of a half-angle 208 
tangent of the sum of two angles 208 
Trisection of an angle 
approximation 11,14 
impossibility of 26 


226 


with aneusis 16 
using origami 141, 143 
with a quadratrix 19 

Two secants, method of 178 


v 


van der Waerden’s problem 92 


van der Waerden, Bartel L. 


Wantzel, Pierre 


11 


92 


Index 


